Impact Evaluation of Reading i-Ready for Striving Learners Using 2018-19 Data


Key Findings 


• Striving learners who used i-Ready in treatment schools performed better on reading
achievement than similar students in comparison schools who did not use i-Ready.


- On average, all striving learners showed gains in reading achievement between
fall of 2018 and spring of 2019; moreover, those who used i-Ready showed
significantly greater gains in student achievement.


• A subset of striving learners, defined as those who performed at or below the 20th
percentile of reading achievement at baseline, who used i-Ready in treatment schools
performed better on reading achievement than striving learners in comparison schools
who did not use i-Ready.


- On average, students who performed at or below the 20th percentile of reading
achievement showed gains in reading achievement between fall of 2018 and
spring of 2019; moreover, those who used i-Ready showed significantly greater
gains in student achievement.


• Black or African American striving learners who used i-Ready in treatment schools
performed better on reading achievement than Black or African American striving
learners who did not use i-Ready in comparison schools.


- Black or African American striving learners experienced similar benefits from i-Ready 
use as non-Black or African American striving learners. 


- On average, all Black or African American striving learners showed gains in
reading achievement between fall of 2018 and spring of 2019; moreover, those
who used i-Ready showed significantly greater gains in student achievement.


• Striving learners of Hispanic origin who used i-Ready in treatment schools performed
better on reading achievement than striving learners of Hispanic origin in comparison
schools who did not use i-Ready.


- Striving learners of Hispanic origin in grades 2-4 experienced similar benefits from
i-Ready use as striving learners not of Hispanic origin.


- Striving learners of Hispanic origin in grade 5 showed even greater benefits of
i-Ready use on reading achievement compared to striving learners not of Hispanic
origin.


- On average, all striving learners of Hispanic origin showed gains in reading
achievement between fall of 2018 and spring of 2019; moreover, those who used
i-Ready showed significantly greater gains in student achievement. At grade 5,
striving learners of Hispanic origin who used i-Ready showed even greater gains in
reading achievement than the students not of Hispanic origin who used i-Ready.


Abstract


Curriculum Associates’ i-Ready® Personalized Instruction (i-Ready) is a supplemental, online 
personalized instruction program available for reading and mathematics¹. Prior research has
indicated i-Ready has a positive impact on K-8 student achievement for students overall (e.g., 
Swain, Randel, & Norman Dvorak, 2020). The present study furthers that work by examining the 
impacts of i-Ready for striving learners specifically, to provide schools and districts with more 
targeted information on its effectiveness for these struggling students. The Human Resources 
Research Organization (HumRRO), in collaboration with Century Analytics, implemented a
quasi-experimental design (QED) using academic year 2018-19 i-Ready data to evaluate the 
impact of i-Ready reading instruction on student reading achievement for striving learners in 
grades 2-5 on a nationally normed cognitive assessment. Two populations of striving learners
were examined at each grade: those who tested two or more grade levels below their current
grade in reading at baseline and a subset of these students who fell at the bottom 20th
percentile of reading achievement. The percentiles were based on reading achievement
measured by the i-Ready® Diagnostic (Diagnostic) at baseline. It was hypothesized that student
achievement, as measured by the Diagnostic, would be higher for striving learners using
i-Ready for reading than for comparison groups of students who did not use this instruction.
Exploratory analyses examined whether the findings for the striving learners were consistent for 
Black or African American striving learners and those of Hispanic origin. Matching was 
conducted at each grade level to meet two needs: 1) identify a set of comparison schools 
demographically similar to our i-Ready schools, and 2) identify a set of academically equivalent 
comparison students within the matched comparison schools. Students who received i-Ready 
and students in the comparison group took the reading version of the Diagnostic assessment. 
To estimate impacts, hierarchical-linear modeling (HLM) was conducted separately for each 
grade level with students at level 1 and schools at level 2. This process was conducted for the 
full sample of striving learners and again for the subsample of students at the bottom 20th
percentile. Results suggest both the striving learners and students at the bottom 20th percentile
using i-Ready with fidelity in the treatment schools performed statistically significantly better on
reading than students in the comparison schools who did not use this instruction. The effect 
sizes fell within the range which recent research characterizes as modest for an education 
intervention (Kraft, 2019). These findings provide support that i-Ready for reading used with
fidelity in schools can lead to higher reading achievement for striving learners. Exploratory 
analyses found that these impacts were consistent for the Black or African American striving 
learners and for striving learners of Hispanic origin at grades 2-4. A positive Hispanic origin by
treatment group interaction was present at grade 5, indicating i-Ready had greater impacts on
reading achievement of striving learners of Hispanic origin as compared to striving learners not
of Hispanic origin who used i-Ready.


1 https://www.curriculumassociates.com/products/i-ready


Introduction


For more than 50 years, Curriculum Associates has provided educational products and services 
with the goal of improving education for students and teachers. They provide various 
assessment and instructional resources and professional development for reading and 
mathematics. One available product is the i-Ready® Diagnostic (Diagnostic), available for 
grades K-12. The Diagnostic assessments, typically taken in the fall, winter, and spring of a 
given academic year, are (a) online, computer-adaptive assessments that pinpoint student 
needs at the sub-skill level and (b) help monitor the extent to which students are on track to 
achieve end-of-year targets. The Diagnostic assessments are independent measures often 
used by educators as interim classroom benchmark assessments. Another product is i-Ready® 
Personalized Instruction (i-Ready), available for grades K-8. i-Ready is personalized instruction
included with Curriculum Associates’ i-Ready Learning products. The instruction provided to
students is driven by performance on the Diagnostic and provides tailored instruction that meets 
students’ needs and encourages the development of new skills. 


i-Ready is intended for students of all ability levels. Previous research provides evidence of its 
effectiveness in reading and mathematics when considering K-8 students overall (Swain et al.,
2020). However, Curriculum Associates understands many schools are interested in education 
programs that are proven to be effective with select groups of students, including striving 
learners and students from traditionally disadvantaged backgrounds. Identifying successful 
online learning options for struggling students is particularly relevant in the age of virtual 
learning, as schools develop virtual options that may need to be implemented for the 2020-2021
school year and beyond. The primary purpose of this study was to examine the impact of
i-Ready on reading achievement for striving learners in elementary grades 2-5 using 2018-19 
data. Because achievement gaps in reading are often prevalent for Black or African American 
students and students of Hispanic origin (Stanford CEPA, n.d.), a secondary purpose was to 
examine if i-Ready had a differential impact on Black or African American striving learners or 
striving learners of Hispanic origin. 


The research was conducted by the Human Resources Research Organization (HumRRO) and 
Century Analytics. HumRRO is an independent research organization that specializes in 
program evaluation and quantitative methodology. Century Analytics is a small business with 
various education research expertise including quasi-experimental design and What Works 
Clearinghouse (WWC) standards. HumRRO and Century Analytics designed the study to meet 
the required rigor of the WWC 4.1 standards to achieve a rating of Meets WWC Group Design 
Standards with Reservations (WWC, 2020a), and to meet guidelines for a Level 2 (or Moderate) 
rating for the Every Student Succeeds Act (ESSA) guidance for evidence-based research (U.S. 
Department of Education, 2016). To accomplish this, we used a quasi-experimental design 
(QED), established baseline equivalence between the treatment and comparison groups, 
included baseline achievement as a covariate, and used a sampling design that mitigates the 
effects of any confounding factors. 


Defining i-Ready Implementation 


The impact of i-Ready on student achievement was the focus of this evaluation. i-Ready is an
online personalized instruction program aligned to college- and career-readiness standards that 
includes engaging multimedia instruction and progress monitoring of online lessons. Lessons 
are intended to provide a consistent best practice lesson structure and build students’ 
conceptual understanding. i-Ready is intended to be used in conjunction with the Diagnostic,
which monitors student progress and identifies student performance in reading and
mathematics. This diagnostic information helps target student-specific intervention, which can
be provided through i-Ready.


Curriculum Associates has identified key implementation components of i-Ready that highlight
actions recommended for students, teachers, and leaders to obtain the long-term outcome of
improved student learning in reading and mathematics. Among others, the key components include
support at the school and district leadership levels, monitoring of student progress by teachers, and
student use of i-Ready to work through a personalized, scaffolded instruction path.


Curriculum Associates provides guidance to districts and schools on how to implement i-Ready
to best benefit student learning (Curriculum Associates, 2019). Guidance indicates students
achieve greater gains when using i-Ready for an average of 30 to 49 minutes of lesson
time-on-task per week, per subject area. In addition, Curriculum Associates recommends use
for at least 12 to 18 calendar weeks between administrations of the Diagnostic (Curriculum 
Associates, 2018). 


Research Questions 


The primary purpose of this study was to estimate the impact of Curriculum Associates’
i-Ready on student reading achievement for striving learners in grades 2-5. Striving learners were
defined as those who tested two or more grade levels below their current grade at baseline. The 
following confirmatory research question was addressed: 


• What is the impact of i-Ready usage on student reading achievement for striving
learners in schools that implement i-Ready compared to striving learners in schools that
implement the Diagnostic only?


In addition, a second research question sought to examine the impact of i-Ready on student 
achievement for a subset of grade 2-5 striving learners that fell in the bottom 20th percentile of
reading achievement. The following second confirmatory research question was addressed: 


• What is the impact of i-Ready usage on student reading achievement for striving
learners at the bottom 20th percentile in schools that implement i-Ready compared to
these striving learners in schools that implement the Diagnostic only?


In addition to the main research questions, we sought to understand whether the main effects were 
representative of the experiences with i-Ready for Black or African American striving learners and
striving learners of Hispanic origin. We addressed the following exploratory questions: 


• Do Black or African American striving learners experience similar impacts of i-Ready use
on student reading achievement compared to striving learners overall?


• Do striving learners of Hispanic origin experience similar impacts of i-Ready use on
student reading achievement compared to striving learners overall?


Methodology 


In this section, we describe the methodology for conducting our impact analyses. We begin with 
initial design decisions. We then discuss the matching process to achieve baseline equivalence 
and the analytic model. In the subsequent section, we discuss our impact and exploratory 
analysis results.


Design
Eligible Schools and Students 


For each grade, we started with a student-level i-Ready usage file of students in public schools with
Diagnostic and i-Ready use in 2018-19 who had at minimum fall and spring Diagnostic scores. By
including only public schools, we sought to include only students in a relatively traditional school
environment with expectations to follow state-adopted college- and career-ready standards.


For a student within a treatment school to be eligible for inclusion, they must have used i-Ready
for reading a minimum of 18 distinct weeks for an average of at least 30 minutes per week
(Curriculum Associates, 2018). This was consistent with guidance on the minimum i-Ready
usage at the student level for attaining intended goals of improved student reading
achievement. Treatment schools were included only if they began using i-Ready to some extent
prior to the 2018-19 school year. This requirement is based on the understanding that i-Ready
implementation, like the implementation of most new programs, requires a start-up time to learn
the technology and adjust to the schedule before i-Ready is fully implemented. To be eligible for
inclusion as a student in a comparison school, students must not have used any i-Ready for
reading in 2018-19. We removed from the data file students not meeting the treatment or
comparison eligibility requirements when matching students to the two groups.
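

The eligibility rules above can be read as a simple filter. The following is a minimal sketch only: the record layout and every field name are hypothetical, since the structure of the usage file is not described, and only the two thresholds (18 distinct weeks, an average of 30 minutes per week) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MIN_WEEKS = 18            # minimum distinct weeks of i-Ready reading use (from the text)
MIN_AVG_MINUTES = 30.0    # minimum average minutes per week (from the text)


@dataclass
class StudentRecord:
    # All field names are hypothetical; the study's file layout is not given.
    weeks_used: int
    avg_minutes_per_week: float
    school_used_iready_before_2018_19: bool
    has_fall_and_spring_scores: bool


def assign_group(s: StudentRecord) -> Optional[str]:
    """Return 'treatment', 'comparison', or None (excluded from matching)."""
    if not s.has_fall_and_spring_scores:
        return None
    if s.weeks_used == 0:
        # Comparison students must not have used any i-Ready for reading.
        return "comparison"
    if (
        s.school_used_iready_before_2018_19
        and s.weeks_used >= MIN_WEEKS
        and s.avg_minutes_per_week >= MIN_AVG_MINUTES
    ):
        return "treatment"
    # Some use, but below the thresholds or in a first-year school: excluded.
    return None
```

Under this reading, a student with partial use (for example, 10 weeks) belongs to neither group and is dropped before matching.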


Prior to the onset of this study, we defined a striving learner as one who tested two or more 
grade levels below their current grade at baseline. Each student is assigned a grade- 
classification based on their Diagnostic score. Only students assigned a classification of two or 
more grade levels below their current grade were included in our study. For example, a grade 2 
student was included only if they classified at a kindergarten level and a grade 5 student was
included only if they classified as levels K-3. We also identified students at the lowest 20th
percentile of reading achievement at baseline as a subset of striving students to examine.
These students were of interest as they may require intensive academic intervention. Only 
students who met these definitions were included in our study. 
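

As a small illustration of the striving-learner rule, the sketch below encodes the grade classification as an integer. The encoding (kindergarten as 0, grade 1 as 1, and so on) and the function name are assumptions made for the example; the bottom 20th percentile subset is not shown because it depends on Diagnostic norms not given here.

```python
KINDERGARTEN = 0  # assumed encoding: K = 0, grade 1 = 1, grade 2 = 2, ...


def is_striving(current_grade: int, classified_level: int) -> bool:
    """True when the Diagnostic classification is two or more grade levels
    below the student's current grade."""
    return current_grade - classified_level >= 2


# Mirrors the examples in the text: a grade 2 student qualifies only at level K,
# and a grade 5 student qualifies at levels K through 3.
grade2_levels = [lvl for lvl in range(0, 3) if is_striving(2, lvl)]
grade5_levels = [lvl for lvl in range(0, 6) if is_striving(5, lvl)]
```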


Unit of Assignment 


HumRRO and Century Analytics completed investigations to identify the unit of assignment— 
either school-level or student-level—for the sample of striving learners. Because we understand 
there are differences by grade-level, we conducted these investigations separately by grade. 
Using Curriculum Associates usage data, we identified the number of schools for which there 
were students in (a) only the treatment or the comparison group and (b) both the treatment and 
comparison groups. Across grades, 92.7% of students attended schools with students classified 
as only treatment or comparison; thus, we decided that school was the appropriate unit of 
assignment for investigating the impact of i-Ready on the achievement of striving learners.
Separately by grade, we excluded the small percentage of schools with some students classified 
as treatment and other students classified as comparison from our school-level assignment study. 
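

The check described above amounts to counting how many students attend schools where every eligible student falls in one group. A hedged sketch follows, assuming a hypothetical input of (school_id, group) pairs, one per eligible student; it is not the authors' procedure, only one way to compute such a share.

```python
from collections import defaultdict


def single_group_share(records):
    """records: iterable of (school_id, group) pairs, one per eligible student,
    with group in {'treatment', 'comparison'} (hypothetical layout).

    Returns (share of students in single-group schools, set of mixed schools).
    """
    groups_by_school = defaultdict(set)
    students_by_school = defaultdict(int)
    for school_id, group in records:
        groups_by_school[school_id].add(group)
        students_by_school[school_id] += 1

    mixed = {s for s, g in groups_by_school.items() if len(g) > 1}
    total = sum(students_by_school.values())
    in_single = sum(n for s, n in students_by_school.items() if s not in mixed)
    return in_single / total, mixed
```

The mixed schools returned by the function correspond to the small set excluded from the school-level assignment study.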


Baseline and Outcome Measure 


We selected the Diagnostic as both the baseline and outcome measure for all students participating
in this study (i.e., i-Ready students and comparison group students). The Diagnostic for reading
measures achievement aligned to common reading content and skills with demonstrated test score
reliability. Marginal reliabilities range from 0.96 to 0.97 and test-retest reliabilities range from 0.85 to
0.86 for reading in grades 2 through 5. Therefore, this assessment meets the WWC 4.1
standards for an acceptable baseline and outcome measure (WWC, 2020a).


The Diagnostic assessments align to college- and career-ready standards so that results can
inform student placement decisions, offer explicit instructional advice, and prescribe resources
for targeted instruction and intervention. The assessments are used by some schools and
districts in conjunction with i-Ready and by others as a stand-alone diagnostic assessment
without the use of i-Ready. The Diagnostic assessments for reading and mathematics are 
currently used by approximately eight million, or nearly 25%, of K-8 students across the United 
States. Thus, the use of Diagnostic as the outcome measure allowed us to include a large 
sample of students from across the United States. The Diagnostic is intended to be 
administered in a standardized manner across schools (Curriculum Associates, 2019b). 
Specifically, teachers of students in the studied grades 2-5 are to schedule the first (fall)
Diagnostic assessment 2-3 weeks into the school year in two 45- to 50-minute sessions.
Curriculum Associates recommends three administrations over the course of the school year, 
with 12-18 weeks between each Diagnostic administration. Teachers also are encouraged to 
test technology to ensure proper function and have pencils and paper available as scratch 
paper. Test administrators provide instructions to their students and motivate them to do their 
best. Teachers monitor students as they complete the assessments. 


Multiple studies have been conducted to support the reliability and validity of the reading 
Diagnostic as well as its consistency with education standards used across the United States. 
Since being released in summer 2011, the Diagnostic has been reviewed and approved at the 
national and state level as an assessment, instructional resource, or intervention in Alabama, 
Arizona, Arkansas, California, Colorado, Connecticut, Delaware, Florida, Georgia, Idaho, Indiana, 
Iowa, Louisiana, Massachusetts, Michigan, Mississippi, Nebraska, Nevada, New Mexico, New
York, North Carolina, Ohio, Oklahoma, Oregon, Tennessee, Texas, Utah, and Virginia. 


Between 2017 and 2019, Curriculum Associates conducted linking studies examining the 
relationship between the Diagnostic and 19 state accountability tests, the Partnership for 
Assessment of Readiness for College and Careers (PARCC) test, and the Smarter Balanced 
Assessment (SBA) at grades 3-8. These studies provide evidence the Diagnostic measures
skills consistent with student expectations and can be used as a student reading achievement 
measure. These studies show strong correlations between Diagnostic scores and scores on 
these national and state tests. The average correlations across grades between the 
assessments and the Diagnostic for reading ranged from 0.75 (Kentucky Performance Rating 
for Educational Progress) to 0.84 (Smarter Balanced Assessment). These findings support that 
the Diagnostic content is highly consistent with what students across the United States are 
expected to learn (Curriculum Associates, 2020). 


Required Number of Schools 


We conducted power analyses using PowerUp! (Dong & Maynard, 2013) to identify the 
minimum detectable effect size (MDES) needed to reject the null hypothesis that there is no 
difference in reading achievement between the treatment and comparison group. Statistical 
power is influenced by various factors. We used data from previous studies HumRRO 
conducted using the Diagnostic as an outcome to estimate conservative and optimistic 
parameters for use in the power analysis. These parameters were: (a) approximately 1,000 
schools available for the analyses per grade, (b) an average of six striving learners eligible for 
inclusion at each school and grade, and (c) 0.10 and 0.30 for the intraclass correlation 
coefficient (ICC). Results of the power analyses indicated an MDES of between 0.06 and 0.08
with our desired statistical power of 0.80. This level of statistical power provides an 80% chance
of detecting a statistically significant difference with 95% confidence if one exists. The available
schools for our analyses across all grades far exceeded the minimum.
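

To make the power calculation concrete, the sketch below applies the standard MDES formula for a two-level design with assignment at the school level. It is not a reproduction of the PowerUp! run: it uses a normal approximation for the multiplier, assumes half of schools are treated, and sets the covariate R-squared values to 0.5 at both levels purely for illustration (the report does not state them). The school count, students per school, and ICC values are the ones listed above.

```python
from math import sqrt
from statistics import NormalDist


def mdes_school_assignment(n_schools, students_per_school, icc,
                           r2_student=0.0, r2_school=0.0,
                           prop_treated=0.5, alpha=0.05, power=0.80):
    """Minimum detectable effect size for a two-level design with
    school-level assignment (normal approximation to the t multiplier)."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)  # about 2.80
    denom = prop_treated * (1 - prop_treated) * n_schools
    between = icc * (1 - r2_school) / denom
    within = (1 - icc) * (1 - r2_student) / (denom * students_per_school)
    return multiplier * sqrt(between + within)


# R-squared of 0.5 at each level is an illustrative assumption, not a study input.
optimistic = mdes_school_assignment(1000, 6, icc=0.10, r2_student=0.5, r2_school=0.5)
conservative = mdes_school_assignment(1000, 6, icc=0.30, r2_student=0.5, r2_school=0.5)
```

With these assumed inputs the two values come out near 0.06 and 0.08, in line with the reported range, but that agreement does not confirm which parameters the authors actually used.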


Analytic Samples 


We used a multi-step process to identify analytic samples separately at each grade to address 
the confirmatory research question. First, we conducted school-level matching to identify a set 
of i-Ready (treatment) and a set of comparison schools from which to match students. Next, we 
conducted student-level matching with students within the selected matched schools to identify 
a set of i-Ready students and comparison students equivalent on reading achievement, as 
measured by the Diagnostic. We computed effect sizes for all school- and student-level 
matching variables following matching and found baseline equivalence was achieved according 
to WWC standards (WWC, 2020a). See Appendix A for details of our matching process and 
final school and student samples. 


Analytic Models 


We used hierarchical linear modeling (HLM) to estimate impacts of i-Ready on student 
achievement for the samples of striving learners to address our confirmatory research 
questions. Similarly, we used HLM for our samples of Black or African American and Hispanic 
origin striving learners to address our exploratory research questions. For each analysis, we 
chose a two-level model with level 1 as the student and level 2 as the school. The analytic 
model acted as the basis for our models to estimate baseline differences. This section describes 
the analytic models used for impact analyses and for baseline equivalence. 


Benchmark Impact Model 


We developed a benchmark impact model to address our confirmatory research questions. We 
used hierarchical linear modeling (HLM) to estimate the impact of i-Ready on student 
achievement. For level 2 of our model, we included an indicator variable of group membership 
and school-level variables that were publicly available and known to be related to achievement. 
For level 1 we included baseline Diagnostic performance. 


The student-level covariate used in each analysis was: 


• Diagnostic reading baseline performance


The school-level covariates included:


• Group membership (0 = comparison, 1 = i-Ready)
• Urbanicity
• Percent of students eligible for free and reduced-price lunch (FRL)
• Percent of students of historically marginalized races (HMR)
• Grade-level enrollment


For additional model details and information on sensitivity analyses to examine the robustness 
of the benchmark impact model, see Appendix B. 
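

For orientation, a generic two-level random-intercept model with the covariates listed above can be written as follows. This is a sketch of the usual form only; the symbols, the fixed slope on the baseline score, and the intercept-only random effect are assumptions, and Appendix B remains the authoritative specification.

```latex
% Level 1 (student i in school j): X_{ij} is the baseline Diagnostic reading score
Y_{ij} = \beta_{0j} + \beta_{1} X_{ij} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2 (school j): T_j is group membership (0 = comparison, 1 = i-Ready);
% W_{kj} are the other school-level covariates (urbanicity, FRL, HMR, enrollment)
\beta_{0j} = \gamma_{00} + \gamma_{01} T_{j} + \sum_{k=1}^{K} \gamma_{0,k+1} W_{kj} + u_{0j},
\qquad u_{0j} \sim N(0, \tau^{2})

% Intraclass correlation: share of variance lying between schools
\mathrm{ICC} = \frac{\tau^{2}}{\tau^{2} + \sigma^{2}}
```

In this form, the coefficient on group membership, \gamma_{01}, plays the role of the adjusted mean difference between i-Ready and comparison schools.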


Exploratory Models


We analyzed two additional impact models to address the exploratory research questions focused
on striving learners classified as Black or African American and Hispanic origin. For these models,
an interaction term was added at Level 1. We added a Black or African American by treatment
interaction to address our research question, Do Black or African American striving learners
experience similar impacts of i-Ready use on student reading achievement compared to striving
learners overall? We added a Hispanic origin by treatment interaction to address our research
question, Do striving learners of Hispanic origin experience similar impacts of i-Ready use on
student reading achievement compared to striving learners overall? Level 2 for both models was
specified consistent with our baseline model. See Appendix B for additional model details.


Baseline Difference Model 


We used a baseline difference model to provide a model-based estimate of the difference 
between students in the treatment and comparison groups on the baseline (fall Diagnostic) 
score separately for each grade level. This model is described in Appendix B. 


Results 
Benchmark Impact Analysis 
Striving learners 


Table 1 contains the benchmark impact model results for the samples of striving learners by grade
for reading spring Diagnostic scores. Full results of the HLM model are available in Appendix C,
with a discussion of model assumption checks presented in Appendix I. For all grade levels, the
adjusted mean differences were positive and statistically significant, indicating the i-Ready group
earned higher reading scores than the comparison group. Hedges’ g effect sizes ranged from
0.12 to 0.14. Recent research by Kraft (2019) notes traditional guidelines are often too rigid for the
realities of education evaluations designed to meet the rigor required by the U.S. Department of
Education, including those developed in accordance with WWC standards. He specifies effect
size ranges of 0.03 to 0.17 as typical of rigorous education interventions, and these often
represent a meaningful effect. All effect sizes fall at the upper end of this range. Based on these
findings, we consider the reading effect sizes for all grades modest for an education intervention.
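

The effect sizes can be approximately recovered from the values reported in Table 1. The sketch below assumes the WWC-style form of Hedges’ g (adjusted mean difference divided by the unadjusted pooled standard deviation, with a small-sample correction) and assumes the SD column in Table 1 holds unadjusted outcome standard deviations; neither assumption is confirmed in this section.

```python
from math import sqrt


def hedges_g(adj_mean_diff, sd_t, n_t, sd_c, n_c):
    """Hedges' g from an adjusted mean difference and unadjusted group SDs."""
    pooled_sd = sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                     / (n_t + n_c - 2))
    omega = 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction
    return omega * adj_mean_diff / pooled_sd


# (adjusted mean difference, i-Ready SD, i-Ready n, comparison SD, comparison n)
table1 = {
    2: (4.98, 41.15, 11464, 39.31, 11464),
    3: (6.25, 44.22, 14965, 43.49, 14965),
    4: (5.30, 46.32, 12397, 45.25, 12397),
    5: (5.60, 46.13, 22448, 44.86, 22448),
}
effect_sizes = {grade: round(hedges_g(*row), 2) for grade, row in table1.items()}
```

Rounded to two decimals, this gives 0.12, 0.14, 0.12, and 0.12 for grades 2 through 5, matching the Effect Size column.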


We computed the improvement index, as defined by the WWC Procedures Handbook (WWC, 
2020a), as an additional measure of impact. The improvement indices range between 4.78 
(grades 2, 4, and 5) and 5.57 (grade 3). Improvement indices show the expected change in 
percentile rank for an average comparison student in this study if they had been in the 
intervention group. For example, an improvement index of 4.78 is equivalent to a student at a
comparison school improving from the 50th percentile to better than the 54th percentile if they
were to have participated in the treatment.
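

The improvement index follows directly from the effect size: it is the percentile rank corresponding to g under a standard normal distribution, minus 50. A minimal sketch (the function name is ours):

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Expected percentile-rank change for an average comparison student."""
    return 100 * (NormalDist().cdf(effect_size) - 0.5)


# Effect sizes of 0.12 and 0.14 correspond to the indices reported in the text.
index_012 = round(improvement_index(0.12), 2)  # 4.78
index_014 = round(improvement_index(0.14), 2)  # 5.57
```

Because the index is computed from the rounded effect size, grades with the same reported g (0.12) share the same index (4.78).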


Table 1 also provides the intraclass correlations. The ICCs measure the proportion of the 
variance that is between schools—that is, how much of the variance in reading Diagnostic 
scores can be explained by school-level differences. The ICCs range from 0.12 (grade 4) to 
0.14 (grades 2, 3, and 5). This suggests the majority of variance is due to factors other than 
school-level differences.


Table 1. Impact Analysis Results for Striving Learners of i-Ready (Treatment) Schools Compared to Striving Learners of
Comparison Schools for Reading Student Achievement at Grades 2-5


Grade  Group       Schools  Students  Diagnostic  Diagnostic  ICC   Adj Mean     p-value  Effect  Improvement
                                      Mean        SD                Diff (SE)             Size    Index

2      i-Ready     1,281    11,464    449.60      41.15       0.14  4.98 (0.70)  <.0001   0.12    4.78
       Comparison  1,253    11,464    444.62      39.31

3      i-Ready     1,443    14,965    482.40      44.22       0.14  6.25 (0.61)  <.0001   0.14    5.57
       Comparison  1,404    14,965    476.15      43.49

4      i-Ready     1,558    12,397    495.64      46.32       0.12  5.30 (0.64)  <.0001   0.12    4.78
       Comparison  1,502    12,397    490.33      45.25

5      i-Ready     1,716    22,448    531.56      46.13       0.14  5.60 (0.52)  <.0001   0.12    4.78
       Comparison  1,682    22,448    525.97      44.86


Notes: ICC = intraclass correlation, SD = standard deviation of Diagnostic scores, Adj Mean Diff = adjusted mean difference between i-Ready and comparison
groups, SE = standard error of the adjusted mean difference, and Effect Size = Hedges’ g.


We also provide the gains in reading achievement on the Diagnostic between baseline and
outcome for the treatment and comparison groups as supplemental information to aid in
interpreting the impacts presented above. Table 2 presents the mean baseline scores, outcome
scores, and the gains between these two periods for our i-Ready and comparison striving
learner groups at each grade.


Table 2. Baseline to Outcome Change in Reading Diagnostic Performance for Striving
Learners in i-Ready (Treatment) Schools Compared to Comparison Schools at Grades 2-5


Grade  Group       Schools  Students  Diagnostic     Diagnostic    Baseline to
                                      Baseline Mean  Outcome Mean  Outcome Gain

2      i-Ready     1,281    11,464    394.87         449.60        54.73
       Comparison  1,253    11,464    395.09         444.62        49.53
3      i-Ready     1,443    14,965    433.85         482.40        48.55
       Comparison  1,404    14,965    434.25         476.15        41.90
4      i-Ready     1,558    12,397    456.39         495.64        39.25
       Comparison  1,502    12,397    457.11         490.33        33.22
5      i-Ready     1,716    22,448    498.67         531.56        32.89
       Comparison  1,682    22,448    499.86         525.97        26.11
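

The gain column in Table 2 is the outcome mean minus the baseline mean. The short sketch below recomputes the gains from the table and sets the raw difference in gains beside the adjusted mean difference from Table 1; the variable and function names are ours, and the comparison is descriptive only.

```python
# (baseline mean, outcome mean) per group, copied from Table 2
table2 = {
    2: {"iready": (394.87, 449.60), "comparison": (395.09, 444.62)},
    3: {"iready": (433.85, 482.40), "comparison": (434.25, 476.15)},
    4: {"iready": (456.39, 495.64), "comparison": (457.11, 490.33)},
    5: {"iready": (498.67, 531.56), "comparison": (499.86, 525.97)},
}
# Adjusted mean differences from the benchmark impact model (Table 1)
adjusted_diff = {2: 4.98, 3: 6.25, 4: 5.30, 5: 5.60}


def gain(baseline, outcome):
    return round(outcome - baseline, 2)


raw_gain_diff = {
    grade: round(gain(*g["iready"]) - gain(*g["comparison"]), 2)
    for grade, g in table2.items()
}

for grade in table2:
    print(grade, raw_gain_diff[grade], adjusted_diff[grade])
```

At every grade the raw difference in gains is somewhat larger than the adjusted mean difference, which is why the covariate-adjusted estimates in Table 1, not the raw gains, are the impact estimates.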


As shown in Figures 1-4 below, for each grade, the two striving learner groups start with very
similar baseline means. Both groups show gains in achievement; however, the treatment group
gains are greater than the comparison group gains. While the gain scores presented in Table 2
and Figures 1-4 provide a reasonable approximation of achievement gains, caution is
warranted when interpreting them. Although the gain scores were calculated by subtracting the
baseline mean from the outcome mean, the difference between the gain scores of the two study
groups does not provide an accurate or model-based estimate of the impact because it does
not adjust for covariates. Please refer to Table 1 above for the impact estimates.


Figure 1. Gains in reading Diagnostic achievement between baseline and outcome for the
striving learner treatment and comparison groups at grade 2.


Figure 2. Gains in reading Diagnostic achievement between baseline and outcome for the
striving learner treatment and comparison groups at grade 3.


Figure 3. Gains in reading Diagnostic achievement between baseline and outcome for the
striving learner treatment and comparison groups at grade 4.


Figure 4. Gains in reading Diagnostic achievement between baseline and outcome for the
striving learner treatment and comparison groups at grade 5.


Bottom 20th Percentile Striving Learners


Table 3 contains the benchmark impact model results for the samples of striving learners at the
bottom 20th percentile by grade for reading spring Diagnostic scores. Full results of the HLM
model are available in Appendix D. The findings were similar to those for the full group of
striving learners: the adjusted mean differences were positive and statistically significant at all
grade levels, indicating the i-Ready group earned higher reading scores than the comparison
group. Hedges’ g effect sizes ranged from 0.11 to 0.17, which fall at the upper end of the range
Kraft (2019) identified as typical of education interventions. Based on these findings, we consider the
reading effect sizes for all grades modest for an education intervention.


The improvement indices for the analyses examining the impact of i-Ready on the students at 
the bottom 20th percentile range between 4.38 (grade 4) and 6.75 (grade 3). Improvement 
indices show the expected change in percentile rank for an average comparison student in this 
study if they had been in the intervention group. For example, an improvement index of 4.78 is 
equivalent to a student at a comparison school improving from the 50th percentile to better than 
the 54th percentile if they were to have participated in the treatment. 
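
The report does not state how the improvement index was computed. The sketch below assumes the usual WWC convention, in which the index is the standard normal cumulative probability of the effect size, expressed in percentile points above the 50th percentile. The function names are illustrative, not taken from the study's code.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def improvement_index(effect_size: float) -> float:
    """Expected percentile-rank change for an average comparison student.

    Assumption: index = (Phi(g) - 0.5) * 100, the WWC-style definition.
    """
    return (normal_cdf(effect_size) - 0.5) * 100.0


# Rounded Hedges' g values reported in Table 3, keyed by grade.
table3_effect_sizes = {2: 0.13, 3: 0.17, 4: 0.11, 5: 0.12}

for grade, g in table3_effect_sizes.items():
    print(f"grade {grade}: g = {g:.2f}, index = {improvement_index(g):.2f}")
```

Under this assumption the rounded effect sizes reproduce the improvement indices reported in Table 3 (5.17, 6.75, 4.38, and 4.78), which suggests, but does not confirm, that this is the formula the authors used.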


The ICCs for the bottom 20th percentile student impact analyses range from 0.11 (grade 4) to 
0.14 (grades 2 and 3). This suggests the majority of variance is due to factors other than 
school-level differences. 
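
As a reminder of what the ICC measures in a two-level model, the sketch below computes it from between-school and within-school variance components. The variance values are hypothetical and were chosen only so the result equals 0.14; the report does not list the estimated variance components.

```python
def intraclass_correlation(between_school_var: float, within_school_var: float) -> float:
    """Share of total outcome variance that lies between schools."""
    total = between_school_var + within_school_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_school_var / total


# Hypothetical variance components (not from the report).
icc = intraclass_correlation(between_school_var=280.0, within_school_var=1720.0)
print(f"ICC = {icc:.2f}; share within schools = {1 - icc:.2f}")
```

An ICC of 0.14 means roughly 86% of the variance in scores lies among students within schools rather than between schools.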
Table 3. Impact Analysis Results for Striving Learners at the Bottom 20th Percentile of i-Ready (Treatment) Schools 
Compared to these Striving Learners of Comparison Schools for Reading Student Achievement at Grades 2-5 


                                        Diagnostic     Diagnostic           Adj Mean                Effect   Improvement
Grade  Group        Schools  Students   Outcome Mean   Outcome SD    ICC    Diff (SE)     p-value   Size     Index

2      i-Ready      1,241     9,392     444.58         41.17         0.14   5.10 (0.77)   <.0001    0.13     5.17
       Comparison   1,204     9,426     439.47         39.21

3      i-Ready      1,355     9,809     468.99         44.47         0.14   7.24 (0.75)   <.0001    0.17     6.75
       Comparison   1,280     9,784     461.75         43.10

4      i-Ready      1,499     9,925     488.19         46.64         0.11   5.19 (0.71)   <.0001    0.11     4.38
       Comparison   1,426     9,961     483.00         45.53

5      i-Ready      1,598    11,715     509.26         47.80         0.12   5.39 (0.69)   <.0001    0.12     4.78
       Comparison   1,492    11,707     503.87         45.41


Notes: ICC = intraclass correlation, SD = standard deviation of Diagnostic scores, Adj Mean Diff = adjusted mean difference between i-Ready and comparison 
groups, SE = standard error of the adjusted mean difference, and Effect Size = Hedges' g. 
Table 4 and Figures 5-8 below present the gains in student achievement for the treatment and 
comparison groups of the bottom 20th percentile striving learners. Both groups show gains in 
achievement; however, the treatment group gains are greater than the comparison group gains. 
Although the gain scores presented in Table 4 and Figures 5-8 provide a reasonable 
approximation of achievement gains, caution is warranted when interpreting them. The gain 
scores were calculated by subtracting the baseline mean from the outcome mean; however, the 
difference between the gain scores of the two study groups does not provide an accurate or 
model-based estimate of the impact because the gain scores do not adjust for covariates. Please 
refer to Table 3 for the impact estimates. 


Table 4. Baseline to Outcome Change in Reading Diagnostic Performance for the Lowest 
Performing Students in i-Ready (Treatment) Schools Compared to Comparison Schools at 
Grades 2-5 


                                        Diagnostic      Diagnostic     Baseline to
Grade  Group        Schools  Students   Baseline Mean   Outcome Mean   Outcome Gain

2      i-Ready      1,241     9,392     390.33          444.58         54.25
       Comparison   1,204     9,426     390.56          439.47         48.91
3      i-Ready      1,355     9,809     418.13          468.99         50.86
       Comparison   1,280     9,784     418.44          461.75         43.31
4      i-Ready      1,499     9,925     447.90          488.19         40.29
       Comparison   1,426     9,961     448.76          483.00         34.24
5      i-Ready      1,598    11,715     473.36          509.26         35.90
       Comparison   1,492    11,707     473.80          503.87         30.07
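
The gain scores in Table 4 are simple differences of group means, so they can be reproduced directly. The sketch below recomputes them and sets the unadjusted difference in gains beside the adjusted mean difference from Table 3, to show why the two should not be read interchangeably. The data structure and function name are illustrative.

```python
# (baseline mean, outcome mean) per group, copied from Table 4.
table4_means = {
    2: {"i-Ready": (390.33, 444.58), "Comparison": (390.56, 439.47)},
    3: {"i-Ready": (418.13, 468.99), "Comparison": (418.44, 461.75)},
    4: {"i-Ready": (447.90, 488.19), "Comparison": (448.76, 483.00)},
    5: {"i-Ready": (473.36, 509.26), "Comparison": (473.80, 503.87)},
}

# Covariate-adjusted mean differences, copied from Table 3.
table3_adjusted_diff = {2: 5.10, 3: 7.24, 4: 5.19, 5: 5.39}


def gain_summary(means_by_group):
    """Return each group's gain and the unadjusted difference in gains."""
    gains = {
        group: round(outcome - baseline, 2)
        for group, (baseline, outcome) in means_by_group.items()
    }
    gains["raw_difference"] = round(gains["i-Ready"] - gains["Comparison"], 2)
    return gains


gains = {grade: gain_summary(groups) for grade, groups in table4_means.items()}

for grade, summary in gains.items():
    print(
        f"grade {grade}: i-Ready gain {summary['i-Ready']:.2f}, "
        f"comparison gain {summary['Comparison']:.2f}, "
        f"raw difference {summary['raw_difference']:.2f}, "
        f"adjusted difference {table3_adjusted_diff[grade]:.2f}"
    )
```

At every grade the raw difference in gains is somewhat larger than the model-based estimate, which is why the impact estimates in Table 3 remain the figures to cite.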
Figure 5. Gains in reading Diagnostic achievement between baseline and outcome for the 
bottom 20th percentile striving learner treatment and comparison groups at grade 2. 

Figure 6. Gains in reading Diagnostic achievement between baseline and outcome for the 
bottom 20th percentile striving learner treatment and comparison groups at grade 3. 

Figure 7. Gains in reading Diagnostic achievement between baseline and outcome for the 
bottom 20th percentile striving learner treatment and comparison groups at grade 4. 

Figure 8. Gains in reading Diagnostic achievement between baseline and outcome for the 
bottom 20th percentile striving learner treatment and comparison groups at grade 5. 

Exploratory Analyses 


This section describes the findings of the analyses we conducted to address our two exploratory 
research questions. We begin with the findings pertaining to Black or African American 
students, followed by the findings for students of Hispanic origin. 


Black or African American by Treatment Interactions 


Table 5 presents the impact model results for striving learners with a Black or African American by 
treatment interaction added. The table includes the number of students and schools for which there 
were complete student-level race data for inclusion in the analysis. Full HLM results are presented 
in Appendix E. For all grade levels, interaction terms were not found to be statistically significant. 
Thus, we conclude that Black or African American striving learners see similar positive impacts of 
i-Ready on reading achievement as striving learners overall. In other words, Black or African 
American striving learners who used i-Ready performed better than Black or African American 
students in a comparison group. One should refer to the benchmark analysis results to examine the 
expected impact, including the effect sizes and improvement indices, for Black or African American 
striving learners. Additional details of the Black or African American by treatment interaction 
analyses, including score differences, are presented in Appendix F. 


Table 5. Summary of Black or African American by Treatment Interactions, by Grade 


                                        Interaction
Grade  Group        Schools  Students   Coef.         p-value

2      i-Ready                7,306     -1.95         0.276
       Comparison     629     5,201

3      i-Ready        952    10,096     -0.51         0.709
       Comparison     716     7,292

4      i-Ready      1,075     8,796      1.02         0.491
       Comparison     764     5,992

5      i-Ready      1,228    16,615     -0.97         0.362
       Comparison     858    11,414


Hispanic Origin by Treatment Interactions 


Table 6 contains the impact model results for striving learners with a Hispanic origin by 
treatment interaction added. The table includes the number of students and schools for which 
there were complete student-level race data for inclusion in the analysis. Full HLM results are 
presented in Appendix G. The interactions were not statistically significant at grades 2, 3, and 4. 
For these grade levels we conclude that striving learners of Hispanic origin see similar positive 
impacts of i-Ready on reading achievement as striving learners overall. In other words, students 
of Hispanic origin who used i-Ready performed better than students of Hispanic origin in a 
comparison group. One should refer to the benchmark analysis results to examine the expected 
impact for striving learners of Hispanic origin at these grades. A statistically 
significant positive interaction was found at grade 5, indicating grade 5 striving learners of 
Hispanic origin benefited more from i-Ready than similar students not of Hispanic origin. 
Additional details of the Hispanic origin by treatment interaction analyses, including score 
differences, are presented in Appendix H. 
Table 6. Summary of Hispanic Origin by Treatment Interactions, by Grade 


                                        Interaction
Grade  Group        Schools  Students   Coef.         p-value

2      i-Ready                8,218      0.06         0.966
       Comparison     676     6,087

3      i-Ready        989    11,185      0.77         0.495
       Comparison     773     8,321

4      i-Ready      1,123     9,507      1.95         0.112
       Comparison     812     7,002

5      i-Ready      1,260    17,495      1.74         0.046
       Comparison     894    12,619


Summary and Discussion 


Our study findings suggest implementing i-Ready for reading in schools, when used by students 
with fidelity, has a positive impact on student reading achievement for striving learners in 
elementary grades 2-5. This includes striving learners who tested two or more grade levels 
below their current grade in reading achievement at baseline, and a subset of these students 
who fell at or below the 20th percentile in reading achievement. At each grade, striving learners 
who received i-Ready performed statistically significantly better on the reading Diagnostic than 
those in a comparison group. 


Effect sizes provide additional support for i-Ready's effectiveness with striving students overall 
and with the subset of those at the bottom 20th percentile. Effect sizes for all grades and for both 
groups were at the upper end of what Kraft (2019) indicates as typical and potentially 
meaningful in education. Kraft (2019) suggests effect sizes should be considered in conjunction 
with all aspects of an intervention, including the magnitude of the treatment contrast and costs. 
Because i-Ready is personalized online learning intended as a supplemental activity to curricula 
and not an intense intervention, we consider the contrast between treatment and comparison to 
be relatively low. Thus, we consider the effect sizes for impacts on reading highly promising. 
Moreover, the comparison group implemented the Diagnostic, which may have attenuated 
treatment effects for i-Ready. 


Further, this study suggests i-Ready is equally effective for Black or African American striving 
learners as it is for all striving learners. The study also suggests i-Ready is equally effective for 
striving learners of Hispanic origin at grades 2, 3, and 4 as it is for all striving learners. 
Therefore, Black or African American striving learners in the i-Ready group performed better 
than Black or African American striving learners in the comparison group, and striving learners 
of Hispanic origin in the i-Ready group performed better than striving learners of Hispanic origin 
in the comparison group. 


A positive Hispanic origin by treatment interaction suggests students of Hispanic origin at grade 
5 benefited more from i-Ready than students not of Hispanic origin. Similar to the results of 
other grades, striving learners of Hispanic origin at grade 5 in the i-Ready group performed 
better than striving learners of Hispanic origin in the comparison group. In addition, the positive 
interaction indicated striving learners of Hispanic origin in the i-Ready group saw greater benefit 
from i-Ready than students in the i-Ready group who were not of Hispanic origin. In other words, 
i-Ready provided additional benefit to Hispanic origin treatment group students over and above 
what it provided to other striving learners. Because we only saw this impact in one of the four 
grades examined, we recommend Curriculum Associates conduct additional studies to examine 
the impact of i-Ready on striving learners of Hispanic origin to determine if this was an anomaly 
or a meaningful finding. 


Kraft (2019) points out that the U.S. education system is decentralized, and implementation 
procedures are ultimately controlled by local schools and/or teachers. As a QED, this study did 
not attempt to control for curriculum, supplemental resources, or classroom structure. Students 
in both groups were not participants in a research study but rather were everyday users, and 
i-Ready was carried out in real-world conditions. We may have found even larger effect sizes 
had the study been conducted under more controlled circumstances. Impacts are typically 
greater for studies that aim for ideal or close to ideal implementation and less for studies that 
examine real-world implementation. The findings from this study, therefore, should be 
considered quite promising given that statistically significant impacts with modest to strong 
effect sizes were detected for all grade levels in the context of real-world implementation. 


Our study was conducted as a rigorous QED to meet the current standards described by the 
WWC (WWC, 2020b) to achieve a rating of Meets WWC Group Design Standards with 
Reservations. In addition, because we found statistically significant positive effects for all 
grades, this study meets the guidelines set forth by ESSA for a Level 2 (or Moderate) rating for 
evidence-based research (U.S. Department of Education, 2016). 


Limitations and Implications for Future Studies 


This study provides strong evidence supporting the impact on reading achievement from 
i-Ready use for striving learners. This study, however, is not without some limitations. 


First, our study was a QED with the typical limitations, including a lack of information on 
implementation decisions made at each school and within each classroom. We recommend 
randomized controlled trials (RCTs) in the future and collecting implementation fidelity information 
from treatment schools as well as collecting information about programs within comparison 
schools that might be similar in nature to i-Ready. We suggest including only one district to 
allow greater control of implementation and fewer confounds. 


Next, our Black or African American and Hispanic origin interactions required use of student- 
level race and ethnicity demographic data. These data were missing for several students in the 
i-Ready usage datasets provided. We recommend Curriculum Associates continue ongoing 
efforts to increase the likelihood schools will provide this information for their students. 


Finally, our treatment group was compared to a matched comparison group that used the 
Diagnostic. It is possible that use of the Diagnostic itself increases student achievement. 
However, the design of this study did not allow for an estimation of that impact. Using 
Diagnostic-only schools and students as the comparison group may have attenuated the 
estimated effects of i-Ready use relative to what would have been observed had the treatment 
group been compared to a “business as usual” comparison group. Future studies might examine 
the impact of i-Ready using a set of comparison schools and students not implementing any 
Curriculum Associates products. This would require an external achievement measure, 
potentially a state assessment, as the baseline and outcome measure. 
Quality Control Procedures 


We employed various quality control checks throughout the data cleaning, analysis, and 
reporting processes. HumRRO, Curriculum Associates, and Century Analytics worked together 
to identify a rigorous methodology based on implementation of i-Ready with fidelity, the WWC 
4.1 standards, and ESSA Level 2 guidelines. 


Eligibility criteria for the treatment and comparison groups were determined through 
collaboration between the three study partners. Curriculum Associates provided information on 
the various components of i-Ready and the frequency with which it should be used for 
implementation with fidelity. They also provided i-Ready data to allow HumRRO and Century 
Analytics to empirically examine the extent to which these recommendations were followed by 
i-Ready schools. These discussions led to treatment and comparison group criteria in which all 
partners were confident. 


Data analysis work was completed collaboratively by HumRRO and Century Analytics. Century 
Analytics and HumRRO independently conducted matching and HLM analyses for each grade. 
The researchers reviewed results against each other and worked out any discrepancies. All 
results reported in this study were verified by researchers from both organizations. 
Appendix A 
Selection of Analytic Samples 


This appendix describes the process for identifying our analytic samples used to address 
confirmatory and exploratory research questions. 


Confirmatory Analytic Samples 


We used a multi-step process to identify analytic samples separately at each grade to address 
the confirmatory research question. First, we conducted school-level matching to identify a set 
of treatment and a set of comparison schools from which to match students. We matched 
schools on three school-level variables: (a) percent of students of historically marginalized race 
(i.e., a variable combining students identified as Black or African American, Asian, Pacific 
Islander, American Indian/Alaskan Native, or two or more races), (b) percent of students eligible 
for free or reduced-price lunch, and (c) grade-level enrollment. School-level matching was 
conducted separately for each grade and stratified by four levels of school urbanicity (city, 
suburb, town, and rural), resulting in four matched sets of schools. Table A.1 presents the 
average demographic composition of our three matching variables for our final matched reading 
i-Ready and comparison schools, and the effect size of the difference between the groups. As 
shown, the sets of treatment and comparison schools are baseline equivalent on our three 
matching variables. 


Table A.1. Demographic Characteristics of Matched Reading i-Ready (Treatment) and 
Comparison Schools, by Grade 


                             i-Ready Mean     Comparison Mean
Grade  Variable              (SD)             (SD)               Effect Size

2      FRL Percent           49.68 (26.35)    47.36 (27.56)      0.09
       HMR Percent           47.59 (29.13)    46.00 (29.01)      0.05
       Grade Enrollment      83.60 (36.79)    81.25 (36.00)      0.06

3      FRL Percent           51.42 (26.81)    49.60 (29.09)      0.07
       HMR Percent           50.26 (29.99)    48.05 (30.70)      0.07
       Grade Enrollment      85.35 (41.57)    82.43 (37.97)      0.07

4      FRL Percent           54.51 (26.30)    51.27 (28.85)      0.12
       HMR Percent           51.52 (30.47)    49.10 (31.21)      0.08
       Grade Enrollment      86.62 (41.82)    83.02 (38.65)      0.09

5      FRL Percent           57.66 (25.60)    51.87 (29.20)      0.21
       HMR Percent           54.69 (30.08)    49.71 (31.35)      0.16
       Grade Enrollment      88.34 (45.26)    83.90 (44.52)      0.10


Note. HMR = historically marginalized race; FRL = free or reduced-price lunch; SD = standard deviation of variables; 
Effect Size = Cohen's d. 
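
The effect sizes in Table A.1 can be approximated from the reported means and standard deviations. The sketch below computes Cohen's d with an equal-weight pooled standard deviation. That pooling is an assumption, because Table A.1 does not list the number of schools per group, and the report does not say which pooling the authors used.

```python
import math


def cohens_d(mean_t: float, sd_t: float, mean_c: float, sd_c: float) -> float:
    """Standardized mean difference using an equal-weight pooled SD (assumed)."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2.0)
    return (mean_t - mean_c) / pooled_sd


# (i-Ready mean, i-Ready SD, comparison mean, comparison SD) from Table A.1.
rows = {
    "grade 2 FRL percent": (49.68, 26.35, 47.36, 27.56),
    "grade 2 grade enrollment": (83.60, 36.79, 81.25, 36.00),
    "grade 5 FRL percent": (57.66, 25.60, 51.87, 29.20),
    "grade 5 HMR percent": (54.69, 30.08, 49.71, 31.35),
}

for label, (m_t, s_t, m_c, s_c) in rows.items():
    print(f"{label}: d = {cohens_d(m_t, s_t, m_c, s_c):.2f}")
```

For these four rows the sketch returns 0.09, 0.06, 0.21, and 0.16, matching the reported values to two decimals. All are below the 0.25 threshold the report uses for baseline equivalence.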


Next, at each grade level we reduced our datafile to include only striving learners from the 
matched schools. To conduct student-level matching, we first stratified to ensure a treatment 
student was matched to a comparison student with the exact same placement levels on all 
reading domains reported by the Diagnostic. We then used propensity score matching to match 
on composite fall Diagnostic scores within each stratum. Once matching was complete by 
stratum, we combined students across strata to generate one set of matched students for each 
grade, resulting in four separate analytic samples. 
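
The two-stage logic described above (exact stratification, then matching on the fall score within each stratum) can be illustrated with a deliberately simplified sketch. It substitutes greedy one-to-one nearest-neighbor matching on the baseline score itself for an estimated propensity score, so it is a stand-in for the idea rather than the study's procedure. The record layout, identifiers, and placement labels are hypothetical.

```python
from collections import defaultdict


def match_within_strata(treated, comparison):
    """Greedy 1:1 matching on baseline score within exact strata.

    Each record is (student_id, stratum_key, baseline_score), where
    stratum_key is a tuple of domain placement levels. Treated students
    whose stratum has no remaining comparison student go unmatched.
    """
    pools = defaultdict(list)
    for student_id, stratum, score in comparison:
        pools[stratum].append((score, student_id))

    matches = []
    for student_id, stratum, score in sorted(treated, key=lambda rec: rec[2]):
        pool = pools.get(stratum)
        if not pool:
            continue  # no comparison student left in this stratum
        nearest = min(range(len(pool)), key=lambda i: abs(pool[i][0] - score))
        _, comparison_id = pool.pop(nearest)  # match without replacement
        matches.append((student_id, comparison_id))
    return matches


# Hypothetical students: (id, (placement levels by domain), fall score).
treated = [
    ("t1", ("Level 1", "Level 2"), 400),
    ("t2", ("Level 1", "Level 2"), 410),
    ("t3", ("Level K", "Level K"), 380),
]
comparison = [
    ("c1", ("Level 1", "Level 2"), 409),
    ("c2", ("Level 1", "Level 2"), 402),
    ("c3", ("Level 2", "Level 2"), 380),
]

matched_pairs = match_within_strata(treated, comparison)
print(matched_pairs)
```

In this toy data, `t3` stays unmatched because no comparison student shares its placement profile, which shows how exact stratification constrains the matched sample before any score-based matching takes place.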


We calculated effect sizes to compare differences in student baseline achievement between our 
treatment and comparison groups in each grade using our planned baseline difference model 
(described in the next section). For all grades, Hedges’ g was smaller than 0.25 after matching, 
and thus considered baseline equivalent (WWC, 2020a). Table A.2 presents our final matched 
samples of students. 
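
The baseline equivalence check can be sketched as follows, assuming Hedges' g is the adjusted mean difference divided by the sample-size-weighted pooled standard deviation, multiplied by the usual small-sample correction. The report does not spell out its exact computation, so the formula and function names here are assumptions. The inputs are the grade 2 and grade 5 rows of Table A.2.

```python
import math


def hedges_g(adj_mean_diff, sd_t, n_t, sd_c, n_c):
    """Hedges' g from an adjusted mean difference and group SDs (assumed form)."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    small_sample_correction = 1.0 - 3.0 / (4.0 * (n_t + n_c) - 9.0)
    return small_sample_correction * adj_mean_diff / math.sqrt(pooled_var)


def is_baseline_equivalent(g, threshold=0.25):
    """Apply the 0.25 standard-deviation threshold cited in the report."""
    return abs(g) < threshold


g_grade2 = hedges_g(-0.22, 22.70, 11464, 22.59, 11464)
g_grade5 = hedges_g(-1.19, 37.48, 22448, 37.35, 22448)

print(round(g_grade2, 3), is_baseline_equivalent(g_grade2))
print(round(g_grade5, 3), is_baseline_equivalent(g_grade5))
```

With samples this large the small-sample correction is effectively 1, so g is close to the adjusted difference divided by the pooled standard deviation.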


We then reduced our sample to include only a subset of striving learners in the bottom 20th 
percentile. We examined the effect size differences between the treatment and comparison 
groups at each grade and similarly found all groups met baseline equivalence (see Table A.3). 


Table A.2. Reading Baseline Equivalence Statistics for i-Ready (Treatment) and 
Comparison Striving Learner Groups, by Grade 


                                        Diagnostic      Diagnostic     Adj Mean        Effect
Grade  Group        Schools  Students   Baseline Mean   Baseline SD    Diff (SE)       Size

2      i-Ready      1,281    11,464     394.87          22.70          -0.22 (0.38)    -0.010
       Comparison   1,253    11,464     395.09          22.59

3      i-Ready      1,443    14,965     433.85          30.88          -0.40 (0.48)    -0.013
       Comparison   1,404    14,965     434.25          30.22

4      i-Ready      1,558    12,397     456.39          34.63          -0.72 (0.56)    -0.021
       Comparison   1,502    12,397     457.11          34.04

5      i-Ready      1,716    22,448     498.67          37.48          -1.19 (0.55)    -0.032
       Comparison   1,682    22,448     499.86          37.35


Note. SD = standard deviation of variables; SE = standard error; Effect Size = Hedges' g. 


Table A.3. Reading Baseline Equivalence Statistics for i-Ready (Treatment) and 
Comparison Bottom 20th Percentile Striving Learner Groups, by Grade 


                                        Diagnostic      Diagnostic     Adj Mean        Effect
Grade  Group        Schools  Students   Baseline Mean   Baseline SD    Diff (SE)       Size

2      i-Ready      1,241     9,392     390.33          22.59          -0.23 (0.40)    -0.010
       Comparison   1,204     9,426     390.56          22.47
3      i-Ready      1,355     9,809     418.13          26.57          -0.30 (0.46)    -0.012
       Comparison   1,280     9,784     418.44          25.62
4      i-Ready      1,499     9,925     447.90          33.57          -0.86 (0.57)    -0.026
       Comparison   1,426     9,961     448.76          32.96
5      i-Ready      1,598    11,715     473.36          34.70          -0.44 (0.58)    -0.013
       Comparison   1,492    11,707     473.80          34.57


Note. SD = standard deviation of variables; SE = standard error; Effect Size = Hedges' g. 
Exploratory Analytic Samples 


For our exploratory research questions focused on Black or African American striving learners 
and those of Hispanic origin, we modified our sample to include only students from our matched 
sample with the necessary demographic data. To be included in the Black or African American 
analysis, a student needed to have complete race data. To be included in the Hispanic origin 
analysis, a student needed to have complete Hispanic origin data. We determined this approach 
was preferable to imputation because (a) we had large enough sample sizes of students and schools 
with available data to have sufficient power, and (b) internal analysis suggested demographic 
data were missing at random. Though we did not conduct additional matching, we established 
baseline equivalence for the analyses for which we found significant interactions. 


We conducted a post-hoc examination of baseline equivalence for the only significant 
interaction, Hispanic origin by treatment at grade 5. Specifically, we sought to confirm baseline 
equivalence was achieved for students of Hispanic origin in the i-Ready and comparison 
groups. Table A.4 summarizes these findings. As shown, the adjusted mean difference between 
these two groups is small, with an effect size of -0.136, which is below the 0.25 threshold in 
absolute value. Therefore, our students of Hispanic origin were considered baseline equivalent. 


Table A.4. Reading Baseline Equivalence Statistics for Grade 5 Students of Hispanic 
Origin of the i-Ready (Treatment) and Comparison Groups 


                         Diagnostic      Diagnostic     Adj Mean
Group        Students    Baseline Mean   Baseline SD    Diff          Effect Size

i-Ready      17,495      495.64          37.11          -5.04         -0.136
Comparison   12,619      500.68          37.07
Appendix B 
Analytic Model Descriptions 


Benchmark Model 
Level 1 of the benchmark model was specified as: 

$$Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{BASELINE}_{ij}) + e_{ij}$$


where $Y_{ij}$ is the spring Diagnostic score for student $i$ in school $j$. $\beta_{0j}$ is the adjusted mean 
outcome for students in school $j$. $\beta_{1j}$ is the regression slope of the student's baseline (fall) 
Diagnostic score for school $j$. $e_{ij}$ is the random error in the outcome associated with student $i$ in 
school $j$ not accounted for in the model. 


Level 2 of the model was specified as: 


$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{TREAT}_j) + \sum \gamma_q(\mathrm{URBANICITY}_j) + \gamma_{02}(\%\mathrm{FRL}) + \gamma_{03}(\%\mathrm{HMR}) + \gamma_{04}(\mathrm{ENROLL}) + u_{0j}$$


$$\beta_{1j} = \gamma_{10}$$


where $\gamma_{00}$ is the adjusted comparison group grand mean of the outcome, $\gamma_{01}$ is the adjusted 
mean difference in the outcome between school study groups, and TREAT is an indicator 
variable coded as 1 for schools in the i-Ready treatment group and 0 for schools in the 
comparison group. URBANICITY is a vector of indicator variables for school urbanicity (city, 
suburb, town, rural), with coefficients $\gamma_q$. $\gamma_{02}$ - $\gamma_{04}$ are regression slopes of the school-level 
covariates. ENROLL is the number of students enrolled at the grade level in the analysis. $u_{0j}$ is 
the random error in the achievement outcome associated with school $j$. 
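
Substituting the Level 2 equations into Level 1 gives the single combined (mixed) form of the benchmark model, which makes clear that the treatment coefficient is estimated while holding the student's baseline score and the school covariates constant. In the block below, $\mathbf{X}_j\boldsymbol{\gamma}$ is shorthand introduced here for the urbanicity, %FRL, %HMR, and enrollment terms; it does not appear in the report.

```latex
Y_{ij} = \gamma_{00}
       + \gamma_{01}(\mathrm{TREAT}_j)
       + \mathbf{X}_j\boldsymbol{\gamma}
       + \gamma_{10}(\mathrm{BASELINE}_{ij})
       + u_{0j} + e_{ij}
```

Here $u_{0j}$ is the school-level random intercept and $e_{ij}$ the student-level residual, so the model is a random-intercept model with a fixed baseline slope.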


We conducted two sensitivity analyses for each grade to examine the robustness of the findings of 
the benchmark impact model for the samples of striving learners. The first sensitivity analysis 
included a school-level grand mean centered baseline covariate. The second included student- 
level Diagnostic domain level scores to account for the stratification and matching of students 
within fall Diagnostic placement profiles. Both analyses yielded results consistent with our 
benchmark model. 


Sensitivity Analysis 1. The first sensitivity analysis examined the robustness of the findings to 
including a school-level grand mean centered baseline covariate. Level 1 of the model had the 
same specification as the benchmark model. Level 2 of the model was specified as follows: 


$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{TREAT}_j) + \gamma_{02}(\overline{\mathrm{BASELINE}}_{\cdot j} - \overline{\mathrm{BASELINE}}_{\cdot\cdot}) + \sum \gamma_q(\mathrm{URBANICITY}_j) + \gamma_{03}(\%\mathrm{FRL}) + \gamma_{04}(\%\mathrm{HMR}) + \gamma_{05}(\mathrm{ENROLL}) + u_{0j}$$


where $\gamma_{00}$ is the adjusted comparison group grand mean of the outcome, $\gamma_{01}$ is the adjusted 
mean difference in the outcome between school study groups, and TREAT is an indicator 
variable coded as 1 for schools in the i-Ready treatment group and 0 for schools in the 
comparison group. $\gamma_{02}$ is the regression slope of the school-level baseline Diagnostic score 
(grand mean centered). $\gamma_q$ and $\gamma_{03}$ - $\gamma_{05}$ are regression slopes of the school-level covariates 
specified as described in the benchmark model. ENROLL is the number of students enrolled at the 
grade level in the analysis. $u_{0j}$ is the random error in the achievement outcome associated with 
school $j$. 


Results of this model were consistent with the benchmark model findings for all grades: i-Ready 
had statistically significant impacts on student reading achievement, and impact estimates were 
similar to those from the benchmark model. 


Sensitivity Analysis 2. The second sensitivity analysis we conducted examined the robustness 
of the findings to including student-level Diagnostic domain scores to account for the 
stratification and matching of students within fall i-Ready domain placement profiles. Level 2 of 
this model has the same specification as Level 2 of the benchmark model. Level 1 of this 
sensitivity analysis model is specified as follows: 


$$Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{BASELINE}_{ij}) + \beta_{2j}(\mathrm{DOMAIN1}_{ij}) + \beta_{3j}(\mathrm{DOMAIN2}_{ij}) + \beta_{4j}(\mathrm{DOMAIN3}_{ij}) + e_{ij}$$


where $Y_{ij}$ is the spring Diagnostic score for student $i$ in school $j$. $\beta_{0j}$ is the adjusted mean 
outcome for students in school $j$. $\beta_{1j}$ is the regression slope of the student's baseline (fall) 
Diagnostic score for school $j$. $\beta_{2j}$ - $\beta_{4j}$ are regression slopes of the baseline (fall) Diagnostic 
domain scores for student $i$ in school $j$. The reading domain scores at grades 2-5 include 
Vocabulary, Comprehension, Phonics, and High-Frequency Words. Grade 2 also includes a fifth 
domain, Phonological Awareness. $e_{ij}$ is the random error in the outcome associated with student 
$i$ in school $j$ not accounted for in the model. 


Results of this model were consistent with the benchmark model findings for all grades. i-Ready 
had statistically significant impacts on student reading achievement, and impact estimates were 
similar to those from the benchmark model. 
Exploratory Models 

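We analyzed two additional models to address the exploratory research questions. For these 
models, a student-level group indicator was added at Level 1 and its interaction with treatment 
was added at Level 2. To address our research question focused on Black or African American 
striving learners, we defined Level 1 as: 

$$Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{BASELINE}_{ij}) + \beta_{2j}(\mathrm{BLACK\ OR\ AFRICAN\ AMERICAN}_{ij}) + e_{ij}$$


Level 2 of the model was specified as: 


$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{TREAT}_j) + \gamma_{02}(\mathrm{URBANICITY}_j) + \gamma_{03}(\%\mathrm{FRL}) + \gamma_{04}(\%\mathrm{HMR}) + \gamma_{05}(\mathrm{ENROLL}) + u_{0j}$$

$$\beta_{1j} = \gamma_{10}$$

$$\beta_{2j} = \gamma_{20} + \gamma_{21}(\mathrm{TREAT}_j)$$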


To address our research question focused on striving learners of Hispanic origin, we defined 
Level 1 as: 


$$Y_{ij} = \beta_{0j} + \beta_{1j}(\mathrm{BASELINE}_{ij}) + \beta_{2j}(\mathrm{HISPANIC}_{ij}) + e_{ij}$$


Level 2 of the model was specified as: 

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{TREAT}_j) + \gamma_{02}(\mathrm{URBANICITY}_j) + \gamma_{03}(\%\mathrm{FRL}) + \gamma_{04}(\%\mathrm{HMR}) + \gamma_{05}(\mathrm{ENROLL}) + u_{0j}$$

$$\beta_{1j} = \gamma_{10}$$

$$\beta_{2j} = \gamma_{20} + \gamma_{21}(\mathrm{TREAT}_j)$$


Level 1 of the model has the same specification as the benchmark model except for the addition 
of an indicator variable for student membership in either the Black or African American or 
Hispanic origin group. Level 2 of the model also had the same specification of the benchmark 
model except for the addition of the cross-level interaction of student race/ethnicity and 
treatment group status. 
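
Writing the exploratory model in combined form shows where the cross-level interaction enters. In the block below, GROUP stands for either student-level indicator (Black or African American, or Hispanic origin) and $\mathbf{X}_j\boldsymbol{\gamma}$ for the school-level covariate terms; both are shorthand introduced here rather than notation from the report.

```latex
Y_{ij} = \gamma_{00}
       + \gamma_{01}(\mathrm{TREAT}_j)
       + \mathbf{X}_j\boldsymbol{\gamma}
       + \gamma_{10}(\mathrm{BASELINE}_{ij})
       + \gamma_{20}(\mathrm{GROUP}_{ij})
       + \gamma_{21}(\mathrm{TREAT}_j \times \mathrm{GROUP}_{ij})
       + u_{0j} + e_{ij}
```

Under this reading, $\gamma_{01}$ is the treatment effect for students outside the group, $\gamma_{01} + \gamma_{21}$ is the effect for students in the group, and $\gamma_{21}$ is presumably the interaction coefficient summarized in Tables 5 and 6.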


The two exploratory models were run separately with each of the four analytic samples. 
Baseline Difference Model 

We used a baseline difference model to provide a model-based estimate of the difference 
between students in the treatment and comparison groups on the baseline (fall Diagnostic) 
score separately for each grade level. 

$$Y_{ij} = \beta_{0j} + e_{ij}$$

where $Y_{ij}$ is the fall Diagnostic score for student $i$ in school $j$. $\beta_{0j}$ is the adjusted mean outcome 
for students in school $j$. $e_{ij}$ is the random error in the outcome associated with student $i$ in school 
$j$ not accounted for in the model. 
Level 2 of the model was specified as follows: 

$$\beta_{0j} = \gamma_{00} + \gamma_{01}(\mathrm{TREAT}_j) + u_{0j}$$

where $\gamma_{00}$ is the adjusted comparison group grand mean of the fall baseline score, $\gamma_{01}$ is the 
adjusted mean difference in the baseline score between school study groups, and TREAT is an 
indicator variable coded as 1 for schools in the i-Ready treatment group and 0 for schools in the 
comparison group. $u_{0j}$ is the random error in the achievement outcome associated with school $j$. 
Appendix C 
Confirmatory Impact HLM Coefficients for Striving Learners 


Table C.1. Grade 2 Reading HLM Results for Striving Learners 


Covariates                              Coef.     SE      z        p-value   95% Conf. Interval
Student-Level Covariates 
  Fall 2018 i-Ready Diagnostic Score      0.79    0.01    77.86    <0.001      0.77     0.81
School-Level Covariates 
  Treatment Group Membership              4.98    0.70     7.10    <0.001      3.60     6.36
  Urbanicity - City*                      0.91    1.27     0.71    0.475      -1.58     3.39
  Urbanicity - Suburban*                  0.24    1.18     0.20    0.841      -2.08     2.55
  Urbanicity - Town*                     -1.09    1.66    -0.65    0.513      -4.35     2.17
  Percent Free or Reduced Lunch         -12.25    1.69    -7.27    <0.001    -15.55    -8.94
  Percent HMR                           -10.85    1.64    -6.62    <0.001    -14.06    -7.64
  Grade-level Enrollment                  0.00    0.01    -0.15    0.88       -0.02     0.02
Intercept 
  Intercept                             144.20    4.33    33.28    <0.001    135.71   152.69


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
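
The z statistics and confidence intervals in these tables are consistent with standard Wald calculations. The sketch below assumes z = Coef./SE and a 95% interval of Coef. ± 1.96 × SE, and applies them to the grade 2 treatment coefficient in Table C.1. Because the inputs are the rounded table values, the results agree with the reported figures only to within rounding.

```python
def wald_summary(coef: float, se: float, z_crit: float = 1.96):
    """Return the Wald z statistic and the confidence interval bounds."""
    z = coef / se
    lower = coef - z_crit * se
    upper = coef + z_crit * se
    return z, lower, upper


# Treatment Group Membership row of Table C.1 (grade 2).
z, lower, upper = wald_summary(coef=4.98, se=0.70)
print(f"z = {z:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes zero corresponds to a two-sided p-value below 0.05, which is how the significance of the treatment coefficient can be read from the table.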


Table C.2. Grade 3 Reading HLM Results for Striving Learners 


Covariates                              Coef.     SE      z        p-value   95% Conf. Interval
Student-Level Covariates 
  Fall 2018 i-Ready Diagnostic Score      0.88    0.01   138.84    <0.001      0.86     0.89
School-Level Covariates 
  Treatment Group Membership              6.25    0.61    10.32    <0.001      5.06     7.43
  Urbanicity - City*                      1.26    1.04     1.20    0.229      -0.79     3.30
  Urbanicity - Suburban*                  1.29    0.98     1.31    0.189      -0.63     3.21
  Urbanicity - Town*                     -2.50    1.48    -1.69    0.092      -5.40     0.41
  Percent Free or Reduced Lunch          -8.45    1.47    -5.77    <0.001    -11.33    -5.58
  Percent HMR                           -14.48    1.39   -10.41    <0.001    -17.21   -11.75
  Grade-level Enrollment                  0.03    0.01     3.56    <0.001      0.01     0.04
Intercept 
  Intercept                             104.95    3.02    34.74    <0.001     99.03   110.87


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Table C.3. Grade 4 Reading HLM Results for Striving Learners 


Covariates                              Coef.     SE      z        p-value   95% Conf. Interval
Student-Level Covariates 
  Fall 2018 i-Ready Diagnostic Score      0.86    0.01   137.31    <0.001      0.85     0.87
School-Level Covariates 
  Treatment Group Membership              5.30    0.64     8.30    <0.001      4.05     6.56
  Urbanicity - City*                      1.05    1.12     0.93    0.351      -1.15     3.25
  Urbanicity - Suburban*                  0.51    1.07     0.47    0.636      -1.60     2.61
  Urbanicity - Town*                     -1.08    1.50    -0.72    0.473      -4.01     1.86
  Percent Free or Reduced Lunch          -5.16    1.59    -3.24    0.001      -8.28    -2.03
  Percent HMR                           -10.57    1.47    -7.21    <0.001    -13.45    -7.70
  Grade-level Enrollment                  0.02    0.01     2.44    0.015       0.00     0.04
Intercept 
  Intercept                             103.33    3.20    32.32    <0.001     97.06   109.60


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table C.4. Grade 5 Reading HLM Results for Striving Learners 


Covariates                              Coef.     SE      z        p-value   95% Conf. Interval
Student-Level Covariates 
  Fall 2018 i-Ready Diagnostic Score      0.83    0.00   203.31    <0.001      0.82     0.84
School-Level Covariates 
  Treatment Group Membership              5.60    0.52    10.71    <0.001      4.57     6.62
  Urbanicity - City*                      0.13    0.89     0.15    0.882      -1.62     1.88
  Urbanicity - Suburban*                  0.17    0.85     0.20    0.842      -1.49     1.83
  Urbanicity - Town*                     -1.17    1.18    -1.00    0.319      -3.49     1.14
  Percent Free or Reduced Lunch          -9.33    1.27    -7.36    <0.001    -11.82    -6.85
  Percent HMR                            -6.52    1.18    -5.51    <0.001     -8.85    -4.20
  Grade-level Enrollment                  0.00    0.01    -0.28    0.78       -0.01     0.01
Intercept 
  Intercept                             119.83    2.31    51.88    <0.001    115.30   124.36


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Appendix D
Confirmatory Impact HLM Coefficients for Striving Learners at the Bottom 20th
Percentile


Table D.1. Grade 2 Reading HLM Results for Striving Learners at the Bottom 20th Percentile


Covariates Coef. SE z p-value 95% Conf. Interval
Student-Level Covariates
Fall 2018 i-Ready Diagnostic Score 0.73 0.01 63.41 0.000 0.71 0.75
School-Level Covariates
Treatment Group Membership 5.10 0.77 6.67 0.000 3.60 6.60
Urbanicity — City* 0.42 1.39 0.30 0.766 -2.32 3.15
Urbanicity — Suburban* 0.00 1.31 0.00 0.998 -2.56 2.56
Urbanicity — Town* -1.68 1.82 -0.92 0.357 -5.25 1.89
Percent Free or Reduced Lunch -12.14 1.84 -6.59 0.000 -15.76 -8.53
Percent HMR -10.60 1.78 -5.94 0.000 -14.10 -7.10
Grade-level Enrollment 0.00 0.01 -0.05 0.959 -0.02 0.02
Intercept
Intercept 167.40 4.83 34.67 0.000 157.94 176.86


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table D.2. Grade 3 Reading HLM Results for Striving Learners at the Bottom 20th Percentile


Covariates Coef. SE z p-value 95% Conf. Interval
Student-Level Covariates
Fall 2018 i-Ready Diagnostic Score 0.89 0.01 91.85 0.000 0.88 0.91
School-Level Covariates
Treatment Group Membership 7.24 0.75 9.62 0.000 5.76 8.71
Urbanicity — City* 1.87 1.31 1.43 0.154 -0.70 4.44
Urbanicity — Suburban* 2.11 1.24 1.70 0.089 -0.32 4.54
Urbanicity — Town* -2.48 1.85 -1.34 0.179 -6.11 1.14
Percent Free or Reduced Lunch -9.02 1.84 -4.91 0.000 -12.62 -5.43
Percent HMR -15.69 1.73 -9.07 0.000 -19.08 -12.30
Grade-level Enrollment 0.03 0.01 3.16 0.002 0.01 0.05
Intercept
Intercept 96.85 4.39 22.05 0.000 88.25 105.46


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Table D.3. Grade 4 Reading HLM Results for Striving Learners at the Bottom 20th Percentile


Covariates Coef. SE z p-value 95% Conf. Interval
Student-Level Covariates
Fall 2018 i-Ready Diagnostic Score 0.86 0.01 115.91 0.000 0.85 0.88
School-Level Covariates
Treatment Group Membership 5.19 0.71 7.28 0.000 3.79 6.59
Urbanicity — City* 0.96 1.26 0.76 0.446 -1.51 3.43
Urbanicity — Suburban* 0.51 1.21 0.43 0.670 -1.85 2.88
Urbanicity — Town* -0.94 1.67 -0.56 0.574 -4.21 2.34
Percent Free or Reduced Lunch -5.48 1.79 -3.06 0.002 -8.99 -1.96
Percent HMR -10.03 1.64 -6.13 0.000 -13.24 -6.82
Grade-level Enrollment 0.03 0.01 2.81 0.005 0.01 0.04
Intercept
Intercept 101.36 3.71 27.32 0.000 94.09 108.63


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table D.4. Grade 5 Reading HLM Results for Striving Learners at the Bottom 20th Percentile


Covariates Coef. SE z p-value 95% Conf. Interval
Student-Level Covariates
Fall 2018 i-Ready Diagnostic Score 0.83 0.01 124.38 0.000 0.82 0.85
School-Level Covariates
Treatment Group Membership 5.39 0.69 7.81 0.000 4.04 6.74
Urbanicity — City* -0.44 1.20 -0.36 0.716 -2.78 1.91
Urbanicity — Suburban* -0.43 1.14 -0.37 0.710 -2.67 1.82
Urbanicity — Town* -2.03 1.57 -1.29 0.195 -5.11 1.04
Percent Free or Reduced Lunch -8.55 1.70 -5.04 0.000 -11.87 -5.23
Percent HMR -7.92 1.56 -5.08 0.000 -10.98 -4.86
Grade-level Enrollment 0.01 0.01 0.78 0.436 -0.01 0.02
Intercept
Intercept 117.54 3.50 33.55 0.000 110.67 124.40


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Appendix E
Exploratory HLM with Black or African American by Treatment Interaction 


Table E.1. HLM Results for Grade 2 Reading with Black or African American by Treatment 
Interaction 


Covariates Coef. SE z p-value 95% Conf. Interval 
Student-Level Covariates 
Fall 2018 i-Ready Diagnostic Score 0.82 0.01 57.53 <0.001 0.79 0.85
Black or African American -1.37 1.42 -0.97 0.334 -4.14 1.41
Black or African American by Treatment -1.95 1.79 -1.09 0.276 -5.46 1.56
School-Level Covariates
Treatment Group Membership 5.02 1.01 5.00 <0.001 3.05 7.00
Urbanicity — City* -0.83 1.72 -0.48 0.630 -4.21 2.55
Urbanicity — Suburban* -0.20 1.58 -0.12 0.901 -3.29 2.90
Urbanicity — Town* -2.38 2.16 -1.10 0.272 -6.62 1.86
Percent Free or Reduced Lunch -13.04 2.37 -5.51 <0.001 -17.68 -8.40
Percent HMR -9.30 2.31 -4.02 <0.001 -13.83 -4.77
Grade-level Enrollment -0.01 0.01 -0.61 0.539 -0.03 0.02
Intercept
Intercept 137.03 6.02 22.75 <0.001 125.22 148.84


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table E.2. HLM Results for Grade 3 Reading with Black or African American by Treatment 
Interaction 


Covariates Coef. SE z p-value 95% Conf. Interval 
Student-Level Covariates 
Fall 2018 i-Ready Diagnostic Score 0.86 0.01 102.36 <0.001 0.84 0.87
Black or African American -3.81 1.08 -3.53 <0.001 -5.93 -1.69
Black or African American by Treatment -0.51 1.36 -0.37 0.709 -3.18 2.17
School-Level Covariates
Treatment Group Membership 6.46 0.86 7.51 <0.001 4.77 8.15
Urbanicity — City* -0.05 1.39 -0.03 0.973 -2.77 2.67
Urbanicity — Suburban* 1.27 1.28 1.00 0.319 -1.23 3.78
Urbanicity — Town* -3.01 1.91 -1.58 0.115 -6.74 0.73
Percent Free or Reduced Lunch -8.54 1.98 -4.32 <0.001 -12.42 -4.67
Percent HMR -13.34 1.91 -7.00 <0.001 -17.07 -9.60
Grade-level Enrollment 0.04 0.01 3.62 <0.001 0.02 0.05
Intercept
Intercept 113.36 4.03 28.13 <0.001 105.46 121.26


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Table E.3. HLM Results for Grade 4 Reading with Black or African American by Treatment
Interaction


Covariates Coef. SE z p-value 95% Conf. Interval
Student-Level Covariates
Fall 2018 i-Ready Diagnostic Score 0.86 0.01 105.40 <0.001 0.85 0.88
Black or African American -5.31 1.20 -4.44 <0.001 -7.66 -2.97
Black or African American by Treatment 1.02 1.49 0.69 0.491 -1.89 3.94
School-Level Covariates
Treatment Group Membership 4.60 0.91 5.04 <0.001 2.81 6.39
Urbanicity — City* -0.30 1.49 -0.20 0.841 -3.21 2.61
Urbanicity — Suburban* -0.58 1.39 -0.42 0.678 -3.30 2.15
Urbanicity — Town* -2.53 1.92 -1.32 0.187 -6.28 1.23
Percent Free or Reduced Lunch -6.81 2.12 -3.22 0.001 -10.95 -2.66
Percent HMR -6.42 1.97 -3.26 0.001 -10.27 -2.56
Grade-level Enrollment 0.03 0.01 3.29 0.001 0.01 0.05
Intercept
Intercept 103.55 4.20 24.68 <0.001 95.33 111.78


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table E.4. HLM Results for Grade 5 Reading with Black or African American by Treatment 
Interaction 


Covariates Coef. SE z p-value 95% Conf. Interval 
Student-Level Covariates 
Fall 2018 i-Ready Diagnostic Score 0.82 0.01 156.63 <0.001 0.81 0.83
Black or African American -3.02 0.86 -3.50 <0.001 -4.71 -1.33
Black or African American by Treatment -0.97 1.06 -0.91 0.362 -3.05 1.11
School-Level Covariates
Treatment Group Membership 5.09 0.70 7.26 <0.001 3.72 6.47
Urbanicity — City* -0.52 1.10 -0.48 0.635 -2.69 1.64
Urbanicity — Suburban* -0.28 1.02 -0.27 0.786 -2.28 1.73
Urbanicity — Town* -1.83 1.44 -1.27 0.204 -4.65 1.00
Percent Free or Reduced Lunch -10.37 1.63 -6.37 <0.001 -13.56 -7.18
Percent HMR -3.75 1.52 -2.47 0.014 -6.74 -0.77
Grade-level Enrollment 0.00 0.01 -0.38 0.705 -0.02 0.01
Intercept
Intercept 125.14 2.97 42.11 <0.001 119.32 130.97


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Appendix F. Impacts and Baseline to Outcome Gains
for Black or African American by Treatment Interactions


Table F.1 summarizes the achievement differences between Black or African American striving
learners in the i-Ready and comparison groups, by grade. We did not find a statistically significant
interaction at any grade, so the impact of i-Ready for these students was not statistically
significantly different from the impact for other striving learners. The adjusted mean differences
between the i-Ready and comparison groups ranged from just above 3 points (grade 2) to almost
6 points (grade 3). The effect sizes ranged from 0.08 (grade 2) to 0.14 (grade 3), and
improvement indices ranged from 3.19 (grade 2) to 5.57 (grade 3).
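
The improvement indices reported alongside the effect sizes are consistent with the What Works Clearinghouse convention, in which Hedges' g is converted to the expected percentile-rank change of an average comparison-group student. The sketch below assumes that convention (this appendix does not spell out the formula); `improvement_index` is a hypothetical helper name.

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Percentile-point gain for an average comparison-group student.

    Assumed formula: 100 * (Phi(g) - 0.5), where Phi is the standard
    normal cumulative distribution function and g is Hedges' g.
    """
    return 100.0 * (NormalDist().cdf(effect_size) - 0.5)


# Effect sizes quoted in the text for grades 2 and 3.
for grade, g in [(2, 0.08), (3, 0.14)]:
    index = improvement_index(g)
    print(f"grade {grade}: g = {g:.2f} -> improvement index {index:.2f}")
```

Under this assumption, effect sizes of 0.08 and 0.14 reproduce the reported improvement indices of 3.19 and 5.57.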


Table F.1. Differences between i-Ready (Treatment) and Comparison Black or African 
American groups, by Grade. 


Grade  Group       Students  Diagnostic Mean  Diagnostic SD  Adj. Mean Difference  Effect Size  Improvement Index
2      i-Ready     1,510     451.36           39.88          3.07                  0.08         3.19
2      Comparison  794       448.29           38.08
3      i-Ready     2,654     484.10           42.39          5.95                  0.14         5.57
3      Comparison  1,494     478.14           43.75
4      i-Ready     2,399     496.56           45.18          5.63                  0.12         4.78
4      Comparison  1,354     490.93           45.27
5      i-Ready     4,247     532.64           45.82          4.12                  0.09         3.59
5      Comparison  2,293     528.52           46.29

Notes: SD = standard deviation of Diagnostic scores; Adj Mean Diff = adjusted mean difference between i-Ready and
Comparison groups; SE = standard error of the adjusted mean difference; and Effect Size = Hedges' g.


Table F.2 provides the mean change in the reading Diagnostic between baseline and outcome
for Black or African American and non-Black or African American striving learners in the i-Ready
and comparison groups, by grade. These changes are illustrated in Figures F.1-F.4. The means
were generated by the interaction models presented in Tables E.1 through E.4 of Appendix E.
Please note that these interactions were not statistically significant, indicating that the impact of
i-Ready did not differ between Black or African American and non-Black or African American
student groups (i.e., we expect i-Ready to be equally advantageous to both Black or African
American and non-Black or African American striving learners). In addition, although gain scores
provide a good approximation of achievement growth, they do not provide model-based
estimates of group differences or impacts.
Table F.2. Baseline to Outcome Change in Reading Diagnostic Performance for Black or
African American and Non-Black or African American Striving Learners in i-Ready 
(Treatment) Schools Compared to Comparison Schools, by Grade. 


Grade  Student Group                  Condition   Students  Diagnostic Baseline Mean  Diagnostic Outcome Mean  Baseline to Outcome Gain
2      Black or African American      i-Ready     1,510     394.79                    451.36                   56.57
2      Black or African American      Comparison  794       394.78                    448.29                   53.51
2      Not Black or African American  i-Ready     5,796     396.28                    454.68                   58.40
2      Not Black or African American  Comparison  4,407     397.04                    449.65                   52.62
3      Black or African American      i-Ready     2,654     435.92                    484.10                   48.18
3      Black or African American      Comparison  1,494     432.39                    478.14                   45.76
3      Not Black or African American  i-Ready     7,442     435.15                    488.42                   53.27
3      Not Black or African American  Comparison  5,798     436.56                    481.96                   45.39
4      Black or African American      i-Ready     2,399     459.14                    496.55                   37.41
4      Black or African American      Comparison  1,354     457.32                    490.93                   33.61
4      Not Black or African American  i-Ready     6,397     457.61                    500.84                   43.23
4      Not Black or African American  Comparison  4,638     459.11                    496.24                   37.13
5      Black or African American      i-Ready     4,247     498.53                    532.64                   34.11
5      Black or African American      Comparison  2,293     496.75                    528.52                   31.76
5      Not Black or African American  i-Ready     12,368    501.36                    536.62                   35.27
5      Not Black or African American  Comparison  9,121     503.33                    531.53                   28.21
Black Interaction Model - Grade 2 Reading
Figure F.1. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for Black or African American and
non-Black or African American groups at grade 2.


Figure F.2. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for Black or African American and
non-Black or African American groups at grade 3.


Black Interaction Model - Grade 4 Reading
Figure F.3. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for Black or African American and
non-Black or African American groups at grade 4.


Black Interaction Model - Grade 5 Reading
Figure F.4. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for Black or African American and
non-Black or African American groups at grade 5.
Appendix G
Exploratory HLM with Hispanic Origin by Treatment Interaction 


Table G.1. HLM Results for Grade 2 Reading with Hispanic Origin by Treatment Interaction 


Covariates Coef. SE z p-value 95% Conf. Interval 
Student-Level Covariates 
Fall 2018 i-Ready Diagnostic Score 0.82 0.01 61.70 <0.001 0.79 0.84 
Hispanic Origin -1.27 1.04 -1.23 0.220 -3.31 0.76 
Hispanic Origin by Treatment 0.06 1.34 0.04 0.966 -2.57 2.69 
School-Level Covariates 
Treatment Group Membership 4.72 1.00 4.73 <0.001 2.77 6.68 
Urbanicity — City* 0.18 1.59 0.11 0.910 -2.94 3.30 
Urbanicity — Suburban* -0.08 1.48 -0.06 0.954 -2.98 2.81
Urbanicity — Town* -1.26 1.99 -0.63 0.527 -5.16 2.64
Percent Free or Reduced Lunch -15.21 2.23 -6.83 <0.001 -19.58 -10.85
Percent HMR -7.55 2.18 -3.46 0.001 -11.82 -3.27
Grade-level Enrollment 0.00 0.01 0.15 0.884 -0.02 0.03
Intercept
Intercept 137.24 5.61 24.48 <0.001 126.25 148.23


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table G.2. HLM Results for Grade 3 Reading with Hispanic Origin by Treatment Interaction 


Covariates Coef. SE z p-value 95% Conf. Interval 
Student-Level Covariates 
Fall 2018 i-Ready Diagnostic Score 0.86 0.01 108.91 <0.001 0.85 0.88
Hispanic Origin -2.23 0.88 -2.54 0.011 -3.94 -0.51
Hispanic Origin by Treatment 0.77 1.14 0.68 0.495 -1.45 3.00
School-Level Covariates
Treatment Group Membership 6.63 0.84 7.89 <0.001 4.98 8.28
Urbanicity — City* 0.69 1.31 0.53 0.598 -1.88 3.26
Urbanicity — Suburban* 1.86 1.22 1.52 0.128 -0.53 4.24
Urbanicity — Town* -2.47 1.80 -1.38 0.169 -5.99 1.05
Percent Free or Reduced Lunch -9.24 1.89 -4.89 <0.001 -12.94 -5.54
Percent HMR -14.53 1.81 -8.01 <0.001 -18.09 -10.98
Grade-level Enrollment 0.03 0.01 3.66 <0.001 0.02 0.05
Intercept
Intercept 112.06 3.82 29.35 <0.001 104.57 119.54


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Table G.3. HLM Results for Grade 4 Reading with Hispanic Origin by Treatment Interaction


Covariates Coef. SE z p-value 95% Conf. Interval
Student-Level Covariates
Fall 2018 i-Ready Diagnostic Score 0.87 0.01 112.55 <0.001 0.85 0.88
Hispanic Origin -2.02 0.95 -2.14 0.032 -3.88 -0.17
Hispanic Origin by Treatment 1.95 1.23 1.59 0.112 -0.46 4.36
School-Level Covariates
Treatment Group Membership 4.50 0.89 5.07 <0.001 2.76 6.25
Urbanicity — City* 0.56 1.43 0.39 0.696 -2.24 3.36
Urbanicity — Suburban* 0.34 1.35 0.25 0.801 -2.31 2.99
Urbanicity — Town* -3.08 1.82 -1.70 0.090 -6.65 0.48
Percent Free or Reduced Lunch -7.75 2.04 -3.80 <0.001 -11.76 -3.75
Percent HMR -7.59 1.89 -4.01 <0.001 -11.29 -3.88
Grade-level Enrollment 0.04 0.01 3.71 <0.001 0.02 0.06
Intercept
Intercept 101.95 3.96 25.74 <0.001 94.19 109.72


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 


Table G.4. HLM Results for Grade 5 Reading with Hispanic Origin by Treatment Interaction 


Covariates Coef. SE z p-value 95% Conf. Interval 
Student-Level Covariates 
Fall 2018 i-Ready Diagnostic Score 0.82 0.00 163.83 <0.001 0.81 0.83
Hispanic Origin -0.57 0.67 -0.85 0.398 -1.89 0.75
Hispanic Origin by Treatment 1.74 0.87 2.00 0.046 0.03 3.45
School-Level Covariates
Treatment Group Membership 4.63 0.70 6.58 <0.001 3.25 6.01
Urbanicity — City* -0.98 1.08 -0.91 0.363 -3.11 1.14
Urbanicity — Suburban* -0.41 1.01 -0.41 0.685 -2.39 1.57
Urbanicity — Town* -2.21 1.40 -1.58 0.114 -4.96 0.53
Percent Free or Reduced Lunch -12.42 1.59 -7.80 <0.001 -15.54 -9.30
Percent HMR -3.33 1.50 -2.22 0.026 -6.26 -0.39
Grade-level Enrollment 0.00 0.01 -0.60 0.549 -0.02 0.01
Intercept
Intercept 127.97 2.85 44.87 <0.001 122.38 133.55


Notes: *Rural is the reference group for urbanicity; HMR = historically marginalized race 
Appendix H. Impacts and Baseline to Outcome Gains for Hispanic Origin by
Treatment Interactions


Table H.1 summarizes the differences in achievement for striving learners of Hispanic origin in
the i-Ready and comparison groups. We did not find a statistically significant interaction at
grades 2-4, so at these grades the impact of i-Ready for students of Hispanic origin was not
statistically significantly different from the impact for students not of Hispanic origin.
For these grades, the adjusted mean differences between the i-Ready and comparison groups
ranged from 4.78 points (grade 2) to 7.41 points (grade 3), and effect sizes ranged from 0.12
(grade 2) to 0.17 (grade 3). The improvement indices fell between 4.78 (grade 2) and 6.75
(grade 3). We found a positive, statistically significant interaction at grade 5, indicating that the
impact of i-Ready differed for students of Hispanic origin compared to students not of Hispanic
origin. At grade 5, we noted an adjusted mean difference of more than 6 points between the
i-Ready and comparison groups and an effect size of 0.14. This is considered a modest effect for
an education intervention based on Kraft's guidelines (2019). The improvement index was 5.57.
This suggests that a student of Hispanic origin in the comparison group in this study would be
expected to improve by more than five percentile points if they were to use i-Ready with fidelity.
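
The adjusted mean differences discussed above appear to correspond to the sum of the Treatment Group Membership and Hispanic Origin by Treatment coefficients in Tables G.1 through G.4 of Appendix G. The sketch below illustrates that reading, assuming 0/1 dummy coding for both treatment and Hispanic origin (an assumption; the coding is not stated in these appendices). The dictionary and function names are hypothetical, and sums of rounded coefficients can differ from published values by 0.01.

```python
# (treatment coefficient, Hispanic-origin-by-treatment coefficient) by grade,
# copied from Tables G.1 through G.4.
COEFS = {
    2: (4.72, 0.06),
    3: (6.63, 0.77),
    4: (4.50, 1.95),
    5: (4.63, 1.74),
}


def treatment_effect(grade, hispanic_origin):
    """Model-implied i-Ready vs. comparison difference for one subgroup.

    Assumes 0/1 dummy coding, so the interaction coefficient is added
    only when hispanic_origin is True.
    """
    treatment, interaction = COEFS[grade]
    return treatment + (interaction if hispanic_origin else 0.0)


for grade in sorted(COEFS):
    hispanic = treatment_effect(grade, True)
    not_hispanic = treatment_effect(grade, False)
    print(f"grade {grade}: Hispanic origin {hispanic:.2f}, "
          f"not Hispanic origin {not_hispanic:.2f}")
```

Read this way, a significant interaction (as at grade 5) means the two subgroup effects differ by more than sampling error would explain, while a non-significant interaction means the data do not distinguish them.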


Table H.1. Differences between i-Ready (Treatment) and Comparison Hispanic Origin 
groups, by Grade 


Grade  Group       Students  Diagnostic Mean  Diagnostic SD  Adj. Mean Difference  Effect Size  Improvement Index
2      i-Ready     3,087     453.90           40.11          4.78                  0.12         4.78
2      Comparison  2,063     449.12           [illegible]
3      i-Ready     4,106     486.16           44.12          7.41                  0.17         6.75
3      Comparison  2,601     478.76           44.03
4      i-Ready     3,691     498.91           46.74          6.45                  0.14         5.57
4      Comparison  2,231     492.45           44.08
5      i-Ready     6,673     536.05           46.15          6.37*                 0.14         5.57
5      Comparison  4,049     529.68           44.45


Notes: * = statistically significant interaction; SD = standard deviation of Diagnostic scores; Adj Mean Diff = adjusted
mean difference between i-Ready and Comparison groups; SE = standard error of the adjusted mean difference; and
Effect Size = Hedges' g.


Table H.2 provides the mean change in the reading Diagnostic between baseline and outcome for
students of Hispanic origin and students not of Hispanic origin in the i-Ready and comparison
groups, by grade. These changes are illustrated in Figures H.1-H.4. The means were generated by
the interaction models presented in Tables G.1 through G.4 of Appendix G. Please note that these
interactions were not statistically significant at grades 2-4, indicating that the impact of i-Ready
did not differ between Hispanic origin and non-Hispanic origin student groups at these grades
(i.e., we expect i-Ready to be equally advantageous to both students of Hispanic origin and
students not of Hispanic origin). We did find a positive, statistically significant interaction at
grade 5, indicating that students of Hispanic origin saw greater benefit from using i-Ready than
students not of Hispanic origin. As shown, striving learners of Hispanic origin in the i-Ready group
showed gains that surpassed the gains of the comparison groups and of the non-Hispanic origin
treatment group. Although gain scores provide a good approximation of achievement growth,
they do not provide model-based estimates of group differences or impacts.
Table H.2. Baseline to Outcome Change in Reading Diagnostic Performance for Hispanic
Origin and Non-Hispanic Origin Students in i-Ready (Treatment) Schools Compared to
Comparison Schools, by Grade.


Grade  Student Group        Condition   Students  Diagnostic Baseline Mean  Diagnostic Outcome Mean  Baseline to Outcome Gain
2      Hispanic Origin      i-Ready     3,087     393.25                    453.90                   60.65
2      Hispanic Origin      Comparison  2,063     394.68                    449.12                   54.44
2      Not Hispanic Origin  i-Ready     5,131     397.05                    455.12                   58.07
2      Not Hispanic Origin  Comparison  4,024     397.17                    450.40                   53.22
3      Hispanic Origin      i-Ready     4,106     432.36                    486.16                   53.80
3      Hispanic Origin      Comparison  2,601     433.28                    478.76                   45.48
3      Not Hispanic Origin  i-Ready     7,079     435.96                    487.61                   51.66
3      Not Hispanic Origin  Comparison  5,720     436.13                    480.98                   44.85
4      Hispanic Origin      i-Ready     3,691     454.60                    498.91                   44.31
4      Hispanic Origin      Comparison  2,231     456.18                    492.45                   36.27
4      Not Hispanic Origin  i-Ready     5,816     459.01                    498.98                   39.97
4      Not Hispanic Origin  Comparison  4,771     458.46                    494.48                   36.02
5      Hispanic Origin      i-Ready     6,673     497.15                    536.05                   38.90
5      Hispanic Origin      Comparison  4,049     498.35                    529.68                   [illegible]
5      Not Hispanic Origin  i-Ready     10,822    501.39                    534.88                   33.49
5      Not Hispanic Origin  Comparison  8,570     502.19                    530.25                   28.06
Figure H.1. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for students of Hispanic origin and
non-Hispanic origin at grade 2.


Hispanic Origin Interaction Model - Grade 3 Reading
Figure H.2. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for students of Hispanic origin and
non-Hispanic origin at grade 3.


Figure H.3. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for students of Hispanic origin and
non-Hispanic origin at grade 4.


Interaction Model - Grade 5 Reading
Figure H.4. Gains in Reading Diagnostic achievement between baseline and outcome for
the i-Ready (treatment) and comparison groups for students of Hispanic origin and
non-Hispanic origin at grade 5.
Appendix I
Model Assumption Checks


We examined three model assumptions associated with two-level HLM—residual normality, 
independence, and homoscedasticity—using the MIXED_DX macro in SAS (Bell, Smiley, Ene, 
& Blue, 2014) based on the baseline analytic model for all four grade levels of this study. The 
MIXED_DX macro provides visual output including box-and-whisker plots, histograms, scatter 
plots, and summary tables to examine residual normality, linearity, homoscedasticity, and 
influential outliers. The macro provides this information for level 1 and level 2 residuals. 


We reviewed plots and summary tables at level 1 and level 2 for each grade level. These
checks provided assurance that our analytic model was appropriate for our data. We examined
histograms, box-and-whisker plots, and scatter plots to check residual normality. These plots
indicated that our residuals were generally normally distributed. In particular, the histograms of
level 2 residuals showed a highly symmetrical bell shape with little skewness or kurtosis. The
level 1 residuals had some skewness but were close enough to normal to allow confidence.
The level 1 residuals showed no evidence of clearly non-normal distributions, such as a
bimodal distribution. Violation of the assumption of normality of level 1 residuals can
adversely affect estimation of random effect coefficients and variance-covariance components,
but typically will not adversely affect estimation of standard errors and, therefore, inferences
regarding statistical significance. Given that the primary purpose of the models was estimating
treatment effects, the slight lack of normality of the level 1 residuals likely did not have
implications for the findings presented in this report.
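
The visual checks described above can be complemented with simple numeric summaries of residual shape. The sketch below is not the MIXED_DX macro and is not taken from the study; it is a minimal illustration, under the assumption that model residuals are available as a plain list of numbers, of computing sample skewness and excess kurtosis, where values near zero are consistent with approximate normality. The function name and the toy residuals are hypothetical.

```python
def skewness_and_excess_kurtosis(residuals):
    """Return (skewness, excess kurtosis) using population moments.

    Skewness near 0 suggests symmetry; excess kurtosis near 0 suggests
    tails similar to those of a normal distribution.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    # Second, third, and fourth central moments.
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Toy residuals for illustration only (not study data).
toy_residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
skew, kurt = skewness_and_excess_kurtosis(toy_residuals)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
```

In practice these summaries would be computed separately for level 1 and level 2 residuals and read alongside the histograms rather than in place of them.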


Scatter plots of predicted values against residuals at level 1 and level 2 clearly illustrated
random distributions and supported the conclusion that the assumptions of independence and
homoscedasticity were not violated.